METHODS AND COMPOSITIONS FOR MODULATING TYPE I MUSCLE FORMATION USING PGC-1α

Government Support 

Work described herein was supported under grant DK54477 awarded by the
National Institutes of Health. The U.S. government may have certain rights in this
invention.

Related Applications 

This application claims priority to U.S. provisional Application No. 60/357,069,
filed on February 13, 2002, incorporated herein in its entirety by this reference.

Background of the Invention 

The metabolic properties of muscle are profoundly influenced by exercise and
disease. Long-term endurance exercise training or low frequency motor nerve
stimulation promotes the transition toward an oxidative metabolism with enhanced
mitochondrial biogenesis characteristic of slow (type I) skeletal muscle fibers.
Conversely, disuse atrophy, exercise intolerance associated with congestive heart failure,
and mitochondrial myopathies result in loss of type I oxidative skeletal muscle fibers,
chronic fatigue, and increased glycolytic fibers.

Accordingly, there exists a need for additional therapeutic options which can 
modulate type I muscle formation to provide relief for symptoms of heart failure, disuse 
atrophy, mitochondrial myopathies, and systemic metabolic disorders. 

Summary of the Invention

The present invention is based, at least in part, on the discovery that PGC-1α
(also known as, and used interchangeably herein with, PGC-1) regulates type I
(slow-twitch) muscle fiber differentiation and contributes to maintaining muscle cell
determination. Accordingly, the present invention provides methods for modulating
type I muscle formation comprising contacting a cell (i.e., a muscle cell such as a type I
muscle cell or a type II muscle cell) with an agent that modulates PGC-1α expression or
activity, such that type I muscle formation is modulated. In a preferred embodiment,
PGC-1α expression or activity is increased, thereby increasing type I muscle formation.

In addition to PGC-1α, several additional factors involved in the signaling
cascades underlying muscle fiber type determination have been identified, such as the
calcium/calmodulin-dependent protein kinase IV (CaMKIV) and calcineurin A (CnA). It
has been found that the PGC-1α promoter is regulated by both CaMKIV and CnA
activity. CaMKIV activates PGC-1α almost entirely through a binding site for cAMP
response element binding protein (CREB), which is found in the PGC-1α promoter.
Moreover, a positive autoregulatory loop exists by which PGC-1α controls its own
transcription through binding to and coactivation of myocyte enhancer factor 2 (MEF2)
transcription factors, e.g., MEF2C and MEF2D, which are transcription factors that are
targets of CaMKIV and CnA signaling and that bind directly to the PGC-1α promoter.

In one embodiment, the agent that modulates PGC-1α expression or activity is a
PGC-1α nucleic acid molecule (i.e., a human PGC-1α nucleic acid molecule comprising
the nucleic acid sequence of SEQ ID NO:1). In another embodiment, the PGC-1α
nucleic acid molecule is contained within a vector. In yet another embodiment, the
agent is a PGC-1α polypeptide (i.e., a human PGC-1α polypeptide comprising the amino
acid sequence of SEQ ID NO:2). In a further embodiment, the agent is a small
molecule.

The invention also provides methods for identifying compounds capable of modulating
type I muscle formation comprising contacting a cell (i.e., a muscle cell such as a type I
muscle cell or a type II muscle cell) with a compound, and determining whether PGC-1α
expression or activity is modulated. In one embodiment, PGC-1α expression or activity
is increased. In another embodiment, determining whether PGC-1α expression or
activity is modulated comprises measuring PGC-1α expression by Northern blotting. In
another embodiment, determining whether PGC-1α expression or activity is modulated
comprises determining whether expression of at least one of myoglobin, troponin I slow,
troponin I fast, MCAD, COX II, COX IV, or cytochrome c is modulated. In still another
embodiment, determining whether PGC-1α expression or activity is modulated
comprises determining whether an MEF2 transcription factor is activated.
In another embodiment, the invention provides methods for identifying compounds
capable of treating a disorder characterized by aberrant type I muscle formation (i.e.,
heart failure, disuse atrophy, mitochondrial myopathies, or systemic metabolic disorders)
comprising identifying the ability of the compound to modulate the expression or
activity of PGC-1α to thereby identify a compound capable of treating a disorder
characterized by aberrant type I muscle formation. In a preferred embodiment, PGC-1α
expression or activity is increased. In another embodiment, determining whether
PGC-1α expression or activity is modulated comprises measuring PGC-1α expression by
Northern blotting. In yet another embodiment, determining whether PGC-1α expression or
activity is modulated comprises determining whether expression of at least one of
myoglobin, troponin I slow, troponin I fast, MCAD, COX II, COX IV, or cytochrome c
is modulated. In another embodiment, the invention provides compounds identified by
the methods of the invention. In still another embodiment, determining whether PGC-1α
expression or activity is modulated comprises determining whether an MEF2
transcription factor is activated.

The invention further provides methods for treating subjects having disorders
characterized by aberrant type I muscle formation (i.e., heart failure, disuse atrophy,
mitochondrial myopathies, or systemic metabolic disorders), comprising administering
to the subject an agent capable of modulating PGC-1α expression or activity, such that
the disorder is treated. In one embodiment, PGC-1α expression or activity is increased.
In another embodiment, type I muscle formation is increased. In yet another
embodiment, the agent is a PGC-1α nucleic acid molecule (i.e., a human PGC-1α nucleic
acid molecule comprising the nucleic acid sequence of SEQ ID NO:1). In a further
embodiment, the PGC-1α nucleic acid molecule is contained within a vector. In yet a
further embodiment, the agent is a small molecule.

The invention also provides transgenic non-human animals (i.e., mice, rats,
monkeys, horses, dogs, turkeys, fish, cows, pigs, sheep, goats, frogs, chickens, etc.)
comprising an exogenous PGC-1α nucleic acid molecule, wherein the exogenous
PGC-1α nucleic acid molecule is expressed in the skeletal muscle of the non-human
transgenic animal. In one embodiment, the exogenous PGC-1α nucleic acid molecule is
operatively linked to a muscle-specific promoter (e.g., the muscle creatine kinase
promoter, the dystrophin promoter, the myostatin promoter, the GDF-8 promoter, the
UCP-3 promoter, the MyoD promoter, the MEF2 promoter, the myosin heavy chain
promoter, the myosin light chain promoter, or a troponin promoter). In a preferred
embodiment, the non-human animal is a mouse. In another preferred embodiment, the
expression of at least one of myoglobin, troponin I slow, MCAD, COX II, COX IV, or
cytochrome c is upregulated in the muscle cells of the non-human animal.

Other features and advantages of the invention will be apparent from the 
following detailed description and claims. 

Brief Description of the Drawings

Figure 1A-C depicts the induction of PGC-1α by CaMKIV via CREB. (A)
CaMKIV activation of PGC-1α can be abolished by the dominant negative ACREB.
(B) The sequence of the CRE in the mouse PGC-1α promoter (SEQ ID NO:14) and the
human promoter (SEQ ID NO:15). For a description of the CRE in the human PGC-1α
promoter, see Herzig, S. et al. (2001) Nature 413, 179-183. The PGC-1α promoter with
a mutation in the CRE site (ΔCRE) is also depicted (SEQ ID NO:16). (C) Site-directed
mutagenesis of the CRE inhibits CaMKIV-mediated activation of PGC-1α.

Figure 2A-B illustrates the coactivation of MEF2s on the PGC-1α promoter. (A)
MEF2C and MEF2D activate the mouse PGC-1α promoter. (B) MEF2 activity is
increased by CnA.

Figure 3A-B depicts the activation of PGC-1 by MEF2C and MEF2D via a
conserved binding site. (A) Identification of putative MEF2-binding sites in the human
(SEQ ID NO:18) and mouse (SEQ ID NO:19) PGC-1α promoter. Both promoters were
compared to the TRANSFAC transcription factor binding sites database (Quandt, K. et
al. (1995) Nucleic Acids Res 23, 4878-4884) and high-scoring hits to the matrix
V$AMEF2.01 are depicted (SEQ ID NO:17). In the TRANSFAC matrix, basepairs marked
bold are of high information content and underlined basepairs denote the core sequence.
Putative MEF2 binding sites in the PGC-1α promoters are bold and underlined. The
PGC-1α promoter with a mutation in the MEF2 binding site (ΔMEF2) is also depicted
(SEQ ID NO:20). (B) MEF2C and MEF2D activate a conserved MEF-response element
in the PGC-1α promoter.

Figure 4A-D depicts a model for the autoregulatory loop regulating PGC-1α in
muscle fiber type determination. (B-D) Endogenous PGC-1α expression is increased in
transgenic mice expressing ectopic PGC-1α in skeletal muscle.

Detailed Description of the Invention 

The present invention is based, at least in part, on the discovery that PGC-1α can
modulate type I (slow-twitch) muscle formation and mitochondrial biogenesis in muscle
cells, as well as contribute to maintaining muscle cell determination. In particular, it has
been found that PGC-1α can regulate type I muscle fiber differentiation. The present
invention is further based, at least in part, on the discovery that transgenic animals
expressing PGC-1α contain increased type I muscle fibers. Moreover, the muscles from
these animals are more resistant to exercise-induced fatigue, a hallmark of slow-twitch
muscle fibers and of muscles following endurance training.

PGC-1α is a recently described coactivator of nuclear receptors and has been
shown to play a major role in cellular respiration, adaptive thermogenesis, and
gluconeogenesis in tissues such as brown fat and skeletal muscle (Puigserver, P. et al.
(1998) Cell 92:829-839; Wu, Z. et al. (1999) Cell 98:115-124; Yoon, J.C. et al. (2001)
Nature 413(6852):131-8). The discoveries of the instant invention implicate PGC-1α as a
major regulator of type I muscle formation.

More specifically, it has been found that expression of PGC-1α in the muscles of
transgenic mice induces dose-dependent expression of type I muscle specific genes (e.g.,
myoglobin and troponin I slow), as well as mitochondrial specific genes indicative of
type I muscle specific mitochondrial biogenesis (i.e., MCAD, COX II, COX IV, and
cytochrome c). PGC-1α expression in the muscles of the transgenic mice also induces a
dose-dependent down-regulation of the expression of a type II muscle marker, troponin I
fast. Induction of the type I specific genes (also referred to herein as "type I markers")
by PGC-1α in the muscles of the transgenic mice is seen in otherwise type II muscle
fibers. Histological analysis indicates that the muscles of the transgenic mice have
greater numbers of type I fibers than littermate controls, and the isolated muscle fibers
are more resistant to exercise-induced fatigue. The discoveries of the instant invention
thus identify PGC-1α as a major regulator of type I muscle differentiation.
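
By way of non-limiting illustration, the marker pattern described above (type I and mitochondrial markers up, the type II marker troponin I fast down) could be checked computationally once expression has been quantified. The following Python sketch is an assumption-laden example only: the function name `shows_type_i_shift`, the input format (fold change relative to control) and the 1.5-fold threshold are hypothetical and are not taken from the experiments described herein.

```python
# Markers named in the text; the grouping follows the description above.
TYPE_I_MARKERS = ("myoglobin", "troponin I slow", "MCAD",
                  "COX II", "COX IV", "cytochrome c")
TYPE_II_MARKERS = ("troponin I fast",)


def shows_type_i_shift(fold_change, threshold=1.5):
    """Return True if every measured type I marker is induced and every
    measured type II marker is repressed.

    fold_change: dict mapping marker name -> expression relative to control
                 (hypothetical input format).
    threshold:   arbitrary illustrative cut-off, not a value from the source.
    """
    type_i = [fold_change[m] for m in TYPE_I_MARKERS if m in fold_change]
    type_ii = [fold_change[m] for m in TYPE_II_MARKERS if m in fold_change]
    if not type_i and not type_ii:
        return False  # nothing measured, so no call can be made
    induced = all(v >= threshold for v in type_i)
    repressed = all(v <= 1.0 / threshold for v in type_ii)
    return induced and repressed
```

Markers that were not measured are simply ignored in this sketch, so a partial panel can still be evaluated.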

In addition to PGC-1α, several additional factors involved in the signaling
cascades underlying muscle fiber type determination have been identified, including
CaMKIV and CnA. It has been found that the PGC-1α promoter is regulated by both
CaMKIV and CnA activity. Exercise and subsequently elevated intracellular
calcium levels result in an activation of both CaMKIV and CnA in skeletal muscle.
CaMKIV activates PGC-1α almost entirely through a binding site for CREB, which is
found in the PGC-1α promoter (see Herzig, S., et al. (2001) Nature 413, 179-183 for a
description of the cAMP responsive element (CRE) in the human PGC-1α promoter).
Moreover, there is a positive autoregulatory loop by which PGC-1α controls its own
transcription through the binding to and coactivation of myocyte enhancer factor 2
(MEF2) transcription factors, e.g., MEF2C and MEF2D, which are transcription factors
that are targets of CaMKIV and CnA signaling and that bind directly to the PGC-1α
promoter. MEF2 transcription factors therefore can increase transcription of PGC-1α,
and this induction response is enhanced by the presence of PGC-1α. This positive
autoregulatory loop helps to sustain high PGC-1α levels and thus promotes a stable
expression of muscle fiber type I specific genes. It has also been found that ectopic
expression of PGC-1α in the skeletal muscle of transgenic mice increased the levels of
endogenous PGC-1α transcript.

These findings indicate that muscle fiber type determination may therefore
maintain a quasi-stable state through the establishment of a regulatory loop involving
PGC-1α and MEF2 proteins. In other words, once PGC-1α expression is triggered by,
for example, exercise, PGC-1α expression levels are maintained without further muscle
stimuli, thereby promoting a stable expression of muscle fiber type I specific genes.
Knowledge of these mechanisms which control the regulation and maintenance of
muscle fiber type determination allows for tissue-specific targeting of these and other
factors in diseases with impaired muscle formation or general muscle wasting due to
physical inactivity.
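
By way of non-limiting illustration, the quasi-stable behavior described above can be sketched with a toy numerical model of a positive autoregulatory loop. The Python sketch below assumes a single variable `p` standing in for the PGC-1α level, a Hill-type self-activation term standing in for MEF2 coactivation, first-order decay, and a transient external stimulus standing in for exercise. The function names and all parameter values are arbitrary assumptions chosen so that the model is bistable; none of them is a measurement described herein.

```python
def hill(p, k=0.5, n=4):
    """Hill-type self-activation term (assumed form of the feedback)."""
    return p ** n / (k ** n + p ** n)


def simulate(stimulus_duration, t_end=50.0, dt=0.01,
             basal=0.01, vmax=1.0, decay=1.0, stimulus=1.0):
    """Integrate dp/dt = basal + stimulus(t) + vmax*hill(p) - decay*p
    with the forward Euler method and return the level at t_end.

    The stimulus is applied only while t < stimulus_duration.
    All parameters are illustrative assumptions.
    """
    p = basal / decay  # start at the unstimulated resting level
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        drive = stimulus if t < stimulus_duration else 0.0
        dp = basal + drive + vmax * hill(p) - decay * p
        p += dt * dp
    return p


if __name__ == "__main__":
    print("no stimulus:        ", simulate(0.0))
    print("transient stimulus: ", simulate(5.0))
```

In this sketch the level stays near its resting value when no stimulus is applied, whereas a stimulus that is applied briefly and then withdrawn leaves the level in the high state, which is the qualitative behavior the regulatory loop is proposed to provide.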
The instant invention therefore provides methods and compositions for
modulating type I muscle formation using PGC-1α and modulators thereof.
Accordingly, one aspect of the invention pertains to the use of PGC-1α molecules,
referred to herein as PGC-1α nucleic acid and protein molecules, which comprise a
family of molecules having certain conserved structural and functional features, and
which play a role in or function in type I muscle formation associated activities. The
term "family" when referring to the protein and nucleic acid molecules of the invention
is intended to mean two or more proteins or nucleic acid molecules having a common
structural domain and having sufficient amino acid or nucleotide sequence homology as
defined herein. Such family members can be naturally occurring and can be from either
the same or different species. For example, a family can contain a first protein of human
origin, as well as other, distinct proteins of human origin or, alternatively, can contain
homologues of non-human origin. Members of a family may also have common
functional characteristics.

Another aspect of the invention pertains to methods for treating a subject having
a disease or disorder characterized by (or associated with) aberrant or abnormal PGC-1α
nucleic acid expression and/or PGC-1α protein activity. These methods include the step
of administering a PGC-1α modulator to the subject such that treatment occurs. The
language "aberrant or abnormal PGC-1α expression" refers to expression of a
non-wild-type PGC-1α protein or a non-wild-type level of expression of a PGC-1α protein.
Aberrant or abnormal PGC-1α protein activity refers to a non-wild-type PGC-1α protein
activity or a non-wild-type level of PGC-1α protein activity. As the PGC-1α protein is
involved in, for example, a pathway involving type I muscle formation, aberrant or
abnormal PGC-1α protein activity or nucleic acid expression interferes with the normal
expression of type I muscle specific genes, and/or type I muscle differentiation.

Non-limiting examples of disorders or diseases characterized by or associated
with abnormal or aberrant PGC-1α protein activity or nucleic acid expression (also
referred to herein as PGC-1α associated disorders or as type I muscle associated
disorders) include cardiovascular disorders (i.e., heart failure), disuse atrophy, muscle
wasting (i.e., that caused by disorders such as cancer, AIDS, or other infectious
diseases), mitochondrial myopathies, systemic metabolic disorders (i.e., diabetes, insulin



WO 03/068944 PCT/US03/04792



resistance, hypoglycemia, obesity, body weight disorders, cachexia, or anorexia). See
Braunwald, E. et al. eds. Harrison's Principles of Internal Medicine, Eleventh Edition
(McGraw-Hill Book Company, New York, 1987) pp. 1778-1797; Robbins, S.L. et al.
Pathologic Basis of Disease, 3rd Edition (W.B. Saunders Company, Philadelphia, 1984)
p. 972 for further descriptions of such disorders. The terms "treating" or "treatment," as
used herein, refer to reduction or alleviation of at least one adverse effect or symptom of
a disorder or disease, i.e., a disorder or disease characterized by or associated with
abnormal or aberrant PGC-1α protein activity or PGC-1α nucleic acid expression.

The terms "treating" or "treatment," as used herein, further refer to increasing
type I muscle formation in subjects without a type I muscle associated disorder, i.e., in
subjects wherein increased type I muscle formation is desirable. For example, athletes,
competitive racing animals, and other subjects wherein increased type I muscle
formation would be desirable, may benefit from the methods of the present invention.

As used herein, a PGC-1α modulator is a molecule which can modulate PGC-1α
nucleic acid expression and/or PGC-1α protein activity. For example, a PGC-1α
modulator can modulate, i.e., upregulate (activate) or downregulate (suppress), PGC-1α
nucleic acid expression. In another example, a PGC-1α modulator can modulate (i.e.,
stimulate or inhibit) PGC-1α protein activity. If it is desirable to treat a disorder or
disease characterized by (or associated with) aberrant or abnormal (non-wild-type)
PGC-1α nucleic acid expression and/or PGC-1α protein activity by inhibiting PGC-1α nucleic
acid expression, a PGC-1α modulator can be an antisense molecule, i.e., a ribozyme, as
described herein. Examples of antisense molecules which can be used to inhibit PGC-1α
nucleic acid expression include antisense molecules which are complementary to a
portion of the 5' untranslated region of SEQ ID NO:1 or SEQ ID NO:4 which also
includes the start codon and antisense molecules which are complementary to a portion
of the 3' untranslated region of SEQ ID NO:1 or SEQ ID NO:4.
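
By way of non-limiting illustration, an antisense sequence complementary to a region spanning the start codon can be derived as the reverse complement of the sense strand. The following Python sketch makes several assumptions: the input is a sense-strand cDNA string, the start codon is taken to be the first ATG in that string (which need not hold for a real transcript), and the function names and window sizes are hypothetical. The short sequence in the usage example is invented and is not SEQ ID NO:1 or SEQ ID NO:4.

```python
# Complement table for DNA; U is mapped as well so RNA input is tolerated.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Return the reverse complement of a sense-strand sequence as DNA."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_spanning_start(cdna, upstream=10, downstream=5):
    """Return a DNA antisense sequence covering `upstream` nucleotides of
    5' untranslated region, the start codon, and `downstream` nucleotides
    of coding sequence.

    Assumes the first ATG in `cdna` is the start codon (illustrative only).
    """
    cdna = cdna.upper()
    atg = cdna.find("ATG")
    if atg < 0:
        raise ValueError("no ATG found in sequence")
    begin = max(0, atg - upstream)
    end = min(len(cdna), atg + 3 + downstream)
    return reverse_complement(cdna[begin:end])


if __name__ == "__main__":
    toy = "GGGCCCATGAAATTT"  # invented sequence for demonstration
    print(antisense_spanning_start(toy, upstream=3, downstream=3))
```

An antisense sequence against the 3' untranslated region could be obtained in the same way by passing the corresponding slice of the cDNA to `reverse_complement`.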

A PGC-1α modulator which inhibits PGC-1α nucleic acid expression can also be
a small molecule or other drug, i.e., a small molecule or drug identified using the
screening assays described herein, which inhibits PGC-1α nucleic acid expression. A
PGC-1α molecule of the invention can thus also be used as a target to screen molecules,
i.e., which can modulate PGC-1α activity.






If it is desirable to treat a subject by stimulating PGC-1α nucleic acid
expression, a PGC-1α modulator can be used, for example, a nucleic acid molecule
encoding PGC-1α (i.e., a nucleic acid molecule comprising a nucleotide sequence
homologous to the nucleotide sequence of SEQ ID NO:1 or SEQ ID NO:4), an active
PGC-1α protein or portion thereof (i.e., a PGC-1α protein or portion thereof having an
amino acid sequence which is homologous to the amino acid sequence of SEQ ID NO:2
or SEQ ID NO:5 or a portion thereof), or a small molecule or other drug, i.e., a small
molecule (peptide) or drug identified using the screening assays described herein, which
stimulates PGC-1α nucleic acid expression and/or PGC-1α protein activity.

Alternatively, if it is desirable to treat a disease or disorder characterized by (or
associated with) aberrant or abnormal (non-wild-type) PGC-1α nucleic acid expression
and/or PGC-1α protein activity by inhibiting PGC-1α protein activity, a PGC-1α
modulator can be used, such as an anti-PGC-1α antibody or a small molecule or other
drug, i.e., a small molecule or drug identified using the screening assays described
herein, which inhibits PGC-1α protein activity. In a preferred embodiment, a PGC-1α
modulator is a PGC-1α dominant negative.

If it is desirable to treat a disease or disorder characterized by (or associated with)
aberrant or abnormal (non-wild-type) PGC-1α nucleic acid expression and/or PGC-1α
protein activity by stimulating PGC-1α protein activity, a PGC-1α modulator can be an
active PGC-1α protein or portion thereof (i.e., a PGC-1α protein or portion thereof
having an amino acid sequence which is homologous to the amino acid sequence of SEQ
ID NO:2 or SEQ ID NO:5 or a portion thereof) or a small molecule or other drug, i.e., a
small molecule or drug identified using the screening assays described herein, which
stimulates PGC-1α protein activity.

In addition, a subject having a type I muscle associated disorder (i.e., heart
failure, disuse atrophy, a mitochondrial myopathy, or a systemic metabolic disorder) can
be treated according to the present invention by administering to the subject a PGC-1α
protein or portion thereof or a nucleic acid encoding a PGC-1α protein or portion thereof
such that treatment occurs.

Other aspects of the invention pertain to methods for modulating a cell associated
activity. These methods include contacting the cell with an agent (or a composition
which includes an effective amount of an agent) which modulates PGC-1α protein
activity or PGC-1α nucleic acid expression such that a cell associated activity is altered
relative to a cell associated activity of the cell in the absence of the agent. As used
herein, "a cell associated activity" refers to a normal or abnormal activity or function of
a cell. Examples of cell associated activities include proliferation, migration,
differentiation, production or secretion of molecules, such as proteins, cell survival,
gluconeogenesis, and thermogenesis. In a preferred embodiment, the cell associated
activity is type I muscle formation and the cell is a muscle cell. The term "altered" as
used herein refers to a change, i.e., an increase or decrease, of a cell associated activity.
In one embodiment, the agent stimulates PGC-1α protein activity or PGC-1α nucleic
acid expression. Examples of such stimulatory agents include an active PGC-1α protein,
a nucleic acid molecule encoding PGC-1α that has been introduced into the cell, and a
modulatory agent which stimulates PGC-1α protein activity or PGC-1α nucleic acid
expression and which is identified using the drug screening assays described herein. In
another embodiment, the agent inhibits PGC-1α protein activity or PGC-1α nucleic acid
expression. Examples of such inhibitory agents include a nucleic acid molecule
encoding a dominant negative PGC-1α protein, a dominant negative PGC-1α protein, an
antisense PGC-1α nucleic acid molecule, an anti-PGC-1α antibody, and a modulatory
agent which inhibits PGC-1α protein activity or PGC-1α nucleic acid expression and
which is identified using the drug screening assays described herein. These modulatory
methods can be performed in vitro (i.e., by culturing the cell with the agent) or,
alternatively, in vivo (i.e., by administering the agent to a subject). In a preferred
embodiment, the modulatory methods are performed in vivo, i.e., the cell is present
within a subject, i.e., a mammal, i.e., a human, and the subject has a disorder or disease
characterized by or associated with abnormal or aberrant PGC-1α protein activity or
PGC-1α nucleic acid expression.

The methods of the present invention may therefore: 1) modulate type I muscle
formation; 2) modulate the conversion of type II muscle fibers into type I muscle fibers;
3) modulate the response of muscle fibers to exercise-induced fatigue; 4) treat diseases or
disorders characterized by aberrant PGC-1α expression or activity, i.e., heart failure,
disuse atrophy, mitochondrial myopathy, and/or systemic metabolic disease; 5) modulate
the expression of myoglobin, troponin I slow, troponin I fast, MCAD, COX II, COX IV,
and/or cytochrome c; and/or 6) modulate coactivation of MEF2 transcription factors,
e.g., MEF2C and MEF2D.

A nucleic acid molecule, a protein, a PGC-1α modulator, a compound, etc. used
in the methods of treatment can be incorporated into an appropriate pharmaceutical
composition described herein and administered to the subject through a route which
allows the molecule, protein, modulator, or compound, etc. to perform its intended
function. Examples of routes of administration are also described herein.

The nucleotide sequence of the human PGC-1α cDNA and the predicted amino
acid sequence of the human PGC-1α protein are shown in SEQ ID NOs:1 and 2,
respectively. The human PGC-1α gene, which is approximately 3023 nucleotides in
length, encodes a full length protein having a molecular weight of approximately 120 kD
and which is approximately 798 amino acid residues in length. Further description of
the human PGC-1α nucleic acid and polypeptide sequences can be found in PCT
International Publication No. WO 00/32215, incorporated herein by reference.

The nucleotide sequence of the mouse PGC-1α cDNA and the predicted amino
acid sequence of the mouse PGC-1α protein are shown in SEQ ID NOs:4 and 5,
respectively. The mouse PGC-1α gene, which is approximately 3066 nucleotides in
length, encodes a full length protein having a molecular weight of approximately 120 kD
and which is approximately 797 amino acid residues in length. Further description of
the mouse PGC-1α nucleic acid and polypeptide sequences can be found in PCT
International Publication Nos. WO 00/32215 and WO 98/54220, U.S. Patent No.
6,166,192, and Puigserver, P. et al. (1998) Cell 92(6):829-39, all of which are incorporated
herein by reference.

PGC-1α family member proteins include several domains/motifs. These
domains/motifs include: two putative tyrosine phosphorylation sites (amino acid
residues 205-213 and 379-386 of SEQ ID NO:2, and amino acid residues 204-212 and
378-385 of SEQ ID NO:5), three putative cAMP phosphorylation sites (amino acid
residues 239-242, 374-377, and 656-658 of SEQ ID NO:2, and 238-241, 373-376, and
655-658 of SEQ ID NO:5), a serine-arginine (SR) rich domain (amino acid residues 563-
601 of SEQ ID NO:2, and 562-600 of SEQ ID NO:5), an RNA binding motif (amino
acid residues 657-710 of SEQ ID NO:2, and 656-709 of SEQ ID NO:5), and an LXXLL
motif (amino acid residues 144-148 of SEQ ID NO:2, and 142-146 of SEQ ID NO:5;
SEQ ID NO:3) which mediates interaction with PPARγ, HNF-4α, and other nuclear
receptors. As used herein, a tyrosine phosphorylation site is an amino acid sequence
which includes at least one tyrosine residue which can be phosphorylated by a tyrosine
protein kinase. Typically, a tyrosine phosphorylation site is characterized by a lysine or
an arginine about seven residues to the N-terminal side of the phosphorylated tyrosine.
An acidic residue (aspartic acid or glutamic acid) is often found at either three or four residues
to the N-terminal side of the tyrosine (Patschinsky, T. et al. (1982) Proc. Natl. Acad. Sci.
USA 79:973-977; Hunter, T. (1982) J. Biol. Chem. 257:4843-4848; Cooper, J.A. et al.
(1984) J. Biol. Chem. 259:7835-7841). As used herein, a "cAMP phosphorylation site"
is an amino acid sequence which includes a serine or threonine residue which can be
phosphorylated by a cAMP-dependent protein kinase. Typically, the cAMP
phosphorylation site is characterized by at least two consecutive basic residues to the
N-terminal side of the serine or threonine (Feramisco, J.R. et al. (1980) J. Biol. Chem.
255:4240-4245; Glass, D.B. and Smith, S.B. (1983) J. Biol. Chem. 258:14797-14803;
Glass, D.B. et al. (1986) J. Biol. Chem. 261:2987-2993). As used herein, a
"serine-arginine rich domain" or an "SR rich domain" is an amino acid sequence which is rich in
serine and arginine residues. Typically, SR rich domains are domains which interact
with the CTD domain of RNA polymerase II or are involved in splicing functions. As
used herein, an "RNA binding motif" is an amino acid sequence which can bind an RNA
molecule or a single stranded DNA molecule. RNA binding motifs are described in
Lodish, H., Darnell, J., and Baltimore, D. Molecular Cell Biology, 3rd ed. (W.H.
Freeman and Company, New York, New York, 1995). As used herein, an "LXXLL
motif" (SEQ ID NO:3) refers to a motif wherein L represents leucine and X can be any
amino acid, and which mediates an interaction between a nuclear receptor and a
coactivator (Heery et al. (1997) Nature 387:733-736; Torchia et al. (1997) Nature
387:677-684).
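
By way of non-limiting illustration, the sequence definitions given above lend themselves to a simple pattern search. The following Python sketch scans an amino acid sequence for LXXLL motifs and for candidate cAMP phosphorylation sites. The regular expression used for the latter (two basic residues, any residue, then serine or threonine) is one assumed reading of "at least two consecutive basic residues to the N-terminal side of the serine or threonine" and is not the only possible one. The function name `find_motifs` and the test peptide are hypothetical; the peptide is invented and is not SEQ ID NO:2 or SEQ ID NO:5.

```python
import re

# Patterns expressed over one-letter amino acid codes.
LXXLL = r"L..LL"                 # L = leucine, X = any residue
CAMP_SITE = r"[RK][RK].[ST]"     # assumed consensus; see lead-in


def find_motifs(sequence, pattern):
    """Return (start, end, match) tuples for every occurrence of `pattern`
    in `sequence`, including overlapping ones.

    Positions are 1-based and inclusive, matching the residue numbering
    convention used in the text.
    """
    hits = []
    # A lookahead with an inner capture group reports overlapping matches.
    for m in re.finditer(f"(?=({pattern}))", sequence.upper()):
        hits.append((m.start(1) + 1, m.end(1), m.group(1)))
    return hits


if __name__ == "__main__":
    peptide = "MAAALSELLKKRRASVDD"  # invented test peptide
    print(find_motifs(peptide, LXXLL))
    print(find_motifs(peptide, CAMP_SITE))
```

A pattern match of this kind identifies only candidate sites; whether a given residue is actually phosphorylated, or whether a given LXXLL motif actually mediates receptor binding, would have to be established experimentally.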
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Various aspects of the invention are described in further detail in the following 
subsections: 

I. Isolated Nucleic Acid Molecules

One aspect of the invention pertains to methods utilizing isolated nucleic acid
molecules that encode PGC-1α or biologically active portions thereof, as well as nucleic
acid fragments sufficient for use as hybridization probes to identify PGC-1α-encoding
nucleic acid (i.e., PGC-1α mRNA). As used herein, the term "nucleic acid molecule" is
intended to include DNA molecules (i.e., cDNA or genomic DNA) and RNA molecules
(i.e., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. The
nucleic acid molecule can be single-stranded or double-stranded, but preferably is
double-stranded DNA. An "isolated" nucleic acid molecule is one which is separated
from other nucleic acid molecules which are present in the natural source of the nucleic
acid. Preferably, an "isolated" nucleic acid is free of sequences which naturally flank the
nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the
genomic DNA of the organism from which the nucleic acid is derived. For example, in
various embodiments, the isolated PGC-1α nucleic acid molecule can contain less than
about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally
flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid
is derived (i.e., a brown adipocyte). Moreover, an "isolated" nucleic acid molecule, such
as a cDNA molecule, can be substantially free of other cellular material, or culture
medium when produced by recombinant techniques, or chemical precursors or other
chemicals when chemically synthesized.

A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule
having the nucleotide sequence of SEQ ID NO:1, SEQ ID NO:4 or a nucleotide
sequence which is at least about 50%, preferably at least about 60%, more preferably at
least about 70%, yet more preferably at least about 80%, still more preferably at least
about 90%, and most preferably at least about 95% or more homologous to the
nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:4 or a portion thereof (e.g.,
400, 450, 500, or more nucleotides), can be isolated using standard molecular biology
techniques and the sequence information provided herein. For example, a human
PGC-1α cDNA can be isolated from a human liver, heart, kidney, or brain cell line (from
Stratagene, La Jolla, CA, or Clontech, Palo Alto, CA) using all or a portion of SEQ ID
NO:1 or SEQ ID NO:4 as a hybridization probe and standard hybridization techniques
(e.g., as described in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, 1989). Moreover, a nucleic acid molecule
encompassing all or a portion of SEQ ID NO:1 or SEQ ID NO:4 or a nucleotide
sequence which is at least about 50%, preferably at least about 60%, more preferably at
least about 70%, yet more preferably at least about 80%, still more preferably at least
about 90%, and most preferably at least about 95% or more homologous to the
nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:4 can be isolated by the
polymerase chain reaction using oligonucleotide primers designed based upon the
sequence of SEQ ID NO:1 or SEQ ID NO:4 or the homologous nucleotide sequence.
For example, mRNA can be isolated from liver cells, heart cells, kidney cells, brain
cells, or brown adipocytes (e.g., by the guanidinium-thiocyanate extraction procedure of
Chirgwin et al. (1979) Biochemistry 18:5294-5299) and cDNA can be prepared using
reverse transcriptase (e.g., Moloney MLV reverse transcriptase, available from
Gibco/BRL, Bethesda, MD; or AMV reverse transcriptase, available from Seikagaku
America, Inc., St. Petersburg, FL). Synthetic oligonucleotide primers for PCR
amplification can be designed based upon the nucleotide sequence shown in SEQ ID
NO:1 or SEQ ID NO:4 or the homologous nucleotide sequence. A nucleic acid of the
invention can be amplified using cDNA or, alternatively, genomic DNA, as a template
and appropriate oligonucleotide primers according to standard PCR amplification
techniques. The nucleic acid so amplified can be cloned into an appropriate vector and
characterized by DNA sequence analysis. Furthermore, oligonucleotides corresponding
to a PGC-1α nucleotide sequence can be prepared by standard synthetic techniques, e.g.,
using an automated DNA synthesizer.

In a preferred embodiment, an isolated nucleic acid molecule of the invention
comprises the nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:4 or a
nucleotide sequence which is at least about 50%, preferably at least about 60%, more
preferably at least about 70%, yet more preferably at least about 80%, still more
preferably at least about 90%, and most preferably at least about 95% or more
homologous to the nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:4. The
sequence of SEQ ID NO:4 corresponds to the mouse PGC-1α cDNA. This cDNA
comprises sequences encoding the PGC-1α protein (i.e., "the coding region", from
nucleotides 92 to 2482), as well as 5' untranslated sequences (nucleotides 1 to 91) and 3'
untranslated sequences (nucleotides 2483 to 3066). Alternatively, the nucleic acid
molecule can comprise only the coding region of SEQ ID NO:4 (i.e., nucleotides 92 to
2482) or the homologous nucleotide sequence. The sequence of SEQ ID NO:1
corresponds to the human PGC-1α cDNA. This cDNA comprises sequences encoding
the PGC-1α protein (i.e., "the coding region", from nucleotides 89 to 2482), as well as 5'
untranslated sequences (nucleotides 1 to 88) and 3' untranslated sequences (nucleotides
2513 to 3023). Alternatively, the nucleic acid molecule can comprise only the coding
region of SEQ ID NO:1 (i.e., nucleotides 89 to 2482) or the homologous nucleotide
sequence.
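
By way of illustration only, the nucleotide coordinates recited above are 1-based and inclusive, which can be sketched in Python as follows. The function names `extract_region` and `region_length` and the toy sequence are assumptions made for this example; the sequences of SEQ ID NO:1 and SEQ ID NO:4 are not reproduced here.

```python
def extract_region(sequence: str, start: int, end: int) -> str:
    """Return the subsequence from `start` to `end`, 1-based and inclusive."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


def region_length(start: int, end: int) -> int:
    """Number of nucleotides spanned by 1-based inclusive coordinates."""
    return end - start + 1


# Hypothetical toy cDNA: 3 nt of 5' UTR, a 9 nt coding region, 4 nt of 3' UTR.
toy_cdna = "GGC" + "ATGAAATAA" + "CCCC"
coding = extract_region(toy_cdna, 4, 12)
```

Under this convention, nucleotides 92 to 2482 span 2391 positions and nucleotides 89 to 2482 span 2394 positions.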

In another preferred embodiment, an isolated nucleic acid molecule of the
invention comprises a nucleic acid molecule which is a complement of the nucleotide
sequence shown in SEQ ID NO:1 or SEQ ID NO:4 or a nucleotide sequence which is at
least about 50%, preferably at least about 60%, more preferably at least about 70%, yet
more preferably at least about 80%, still more preferably at least about 90%, and most
preferably at least about 95% or more homologous to the nucleotide sequence shown in
SEQ ID NO:1 or SEQ ID NO:4. A nucleic acid molecule which is complementary to the
nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:4 or to a nucleotide
sequence which is at least about 50%, preferably at least about 60%, more preferably at
least about 70%, yet more preferably at least about 80%, still more preferably at least
about 90%, and most preferably at least about 95% or more homologous to the
nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:4 is one which is
sufficiently complementary to the nucleotide sequence shown in SEQ ID NO:1 or SEQ
ID NO:4 or to the homologous sequence such that it can hybridize to the nucleotide
sequence shown in SEQ ID NO:1 or SEQ ID NO:4 or to the homologous sequence,
thereby forming a stable duplex.
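
As a minimal sketch of the complementarity described above, the following Python computes the reverse complement of a DNA strand by Watson-Crick pairing. The function names are assumptions made for this example, and the check covers only the limiting case of perfect complementarity; a molecule that is merely "sufficiently complementary" to form a stable duplex need not pass it.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def is_perfect_complement(strand_a: str, strand_b: str) -> bool:
    """True when strand_b is exactly the reverse complement of strand_a."""
    return reverse_complement(strand_a).upper() == strand_b.upper()
```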

In still another preferred embodiment, an isolated nucleic acid molecule of the
invention comprises a nucleotide sequence which is at least about 50%, preferably at
least about 60%, more preferably at least about 70%, yet more preferably at least about
80%, still more preferably at least about 90%, and most preferably at least about 95% or
more homologous to the nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:4
or a portion of this nucleotide sequence. In an additional preferred embodiment, an
isolated nucleic acid molecule of the invention comprises a nucleotide sequence which
hybridizes, e.g., hybridizes under stringent conditions, to the nucleotide sequence shown
in SEQ ID NO:1 or SEQ ID NO:4 or to a nucleotide sequence which is at least about
50%, preferably at least about 60%, more preferably at least about 70%, yet more
preferably at least about 80%, still more preferably at least about 90%, and most
preferably at least about 95% or more homologous to the nucleotide sequence shown in
SEQ ID NO:1 or SEQ ID NO:4.

Moreover, the nucleic acid molecule of the invention can comprise only a portion
of the coding region of SEQ ID NO:1 or SEQ ID NO:4 or the coding region of a
nucleotide sequence which is at least about 50%, preferably at least about 60%, more
preferably at least about 70%, yet more preferably at least about 80%, still more
preferably at least about 90%, and most preferably at least about 95% or more
homologous to the nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:4, for
example a fragment which can be used as a probe or primer or a fragment encoding a
biologically active portion of PGC-1α. The nucleotide sequence determined from the
cloning of the PGC-1α gene from a mouse or human allows for the generation of probes
and primers designed for use in identifying and/or cloning other PGC-1α family
members, as well as PGC-1α homologues in other cell types, e.g., from other tissues, as
well as PGC-1α homologues from other mammals such as rats or monkeys. The
probe/primer typically comprises substantially purified oligonucleotide. The
oligonucleotide typically comprises a region of nucleotide sequence that hybridizes
under stringent conditions to at least about 12, preferably at least about 25, more
preferably about 40, 50 or 75 consecutive nucleotides of a sense sequence of SEQ ID
NO:1 or SEQ ID NO:4, an anti-sense sequence of SEQ ID NO:1 or SEQ ID NO:4, or
naturally occurring mutants thereof. Primers based on the nucleotide sequence in SEQ ID
NO:1 or SEQ ID NO:4 can be used in PCR reactions to clone PGC-1α homologues.

In an exemplary embodiment, a nucleic acid molecule of the present invention
comprises a nucleotide sequence which is about 100, preferably 100-200, preferably
200-300, more preferably 300-400, and even more preferably 400-487 nucleotides in
length and hybridizes under stringent hybridization conditions to a nucleic acid molecule
of SEQ ID NO:1 or SEQ ID NO:4.

Probes based on the PGC-1α nucleotide sequences can be used to detect
transcripts or genomic sequences encoding the same or homologous proteins. In
preferred embodiments, the probe further comprises a label group attached thereto, e.g.,
the label group can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme
co-factor. Such probes can be used as a part of a diagnostic test kit for identifying cells
or tissue which misexpress a PGC-1α protein, such as by measuring a level of a
PGC-1α-encoding nucleic acid in a sample of cells from a subject, e.g., detecting PGC-1α
mRNA levels or determining whether a genomic PGC-1α gene has been mutated or
deleted.

In one embodiment, the nucleic acid molecule of the invention encodes a protein
or portion thereof which includes an amino acid sequence which is sufficiently
homologous to an amino acid sequence of SEQ ID NO:2 or SEQ ID NO:5 such that the
protein or portion thereof maintains one or more of the following biological activities: 1)
it can modulate the expression of myoglobin, troponin I slow, troponin I fast, MCAD,
COX II, COX IV, and/or cytochrome c; 2) it can modulate coactivation of MEF2
transcription factors; 3) it can modulate type I muscle formation; 4) it can modulate the
conversion of type II muscle fibers into type I muscle fibers; 5) it can modulate the
response of muscle fibers to exercise-induced fatigue; and/or 6) it can treat diseases or
disorders characterized by aberrant PGC-1α expression or activity, e.g., heart failure,
disuse atrophy, mitochondrial myopathy, and/or systemic metabolic disease.

As used herein, the language "sufficiently homologous" refers to proteins or
portions thereof which have amino acid sequences which include a minimum number of
identical or equivalent (e.g., an amino acid residue which has a similar side chain as an
amino acid residue in SEQ ID NO:2 or SEQ ID NO:5) amino acid residues to an amino
acid sequence of SEQ ID NO:2 or SEQ ID NO:5 such that the protein or portion thereof
maintains one or more of the following biological activities: 1) modulation of the
expression of myoglobin, troponin I slow, troponin I fast, MCAD, COX II, COX IV,
and/or cytochrome c; 2) modulation of coactivation of MEF2 transcription factors; 3)
modulation of type I muscle formation; 4) modulation of the conversion of type II
muscle fibers into type I muscle fibers; 5) modulation of the response of muscle fibers to
exercise-induced fatigue; and/or 6) treatment of diseases or disorders characterized by
aberrant PGC-1α expression or activity, e.g., heart failure, disuse atrophy, mitochondrial
myopathy, and/or systemic metabolic disease.

In another embodiment, the protein is at least about 50%, preferably at least
about 60%, more preferably at least about 70%, yet more preferably at least about 80%,
still more preferably at least about 90%, and most preferably at least about 95% or more
homologous to the entire amino acid sequence of SEQ ID NO:2 or SEQ ID NO:5.

Portions of proteins encoded by the PGC-1α nucleic acid molecule of the
invention are preferably biologically active portions of the PGC-1α protein. As used
herein, the term "biologically active portion of PGC-1α" is intended to include a portion,
e.g., a domain/motif, of PGC-1α that has one or more of the following activities: 1)
modulation of the expression of myoglobin, troponin I slow, troponin I fast, MCAD,
COX II, COX IV, and/or cytochrome c; 2) modulation of coactivation of MEF2
transcription factors; 3) modulation of type I muscle formation; 4) modulation of the
conversion of type II muscle fibers into type I muscle fibers; 5) modulation of the
response of muscle fibers to exercise-induced fatigue; and/or 6) treatment of diseases or
disorders characterized by aberrant PGC-1α expression or activity, e.g., heart failure,
disuse atrophy, mitochondrial myopathy, and/or systemic metabolic disease. Standard
binding assays, e.g., immunoprecipitations and yeast two-hybrid assays, as described
herein, can be performed to determine the ability of a PGC-1α protein or a biologically
active portion thereof to interact with (e.g., bind to) HNF-4α, FKHR, the PEPCK
promoter, PPARγ, C/EBPα, NRF-1, or nuclear hormone receptors (e.g., known
molecules which interact with PGC-1α). If a PGC-1α family member is found to
interact with HNF-4α, FKHR, the PEPCK promoter, PPARγ, C/EBPα, NRF-1, or
nuclear hormone receptors, then it is also likely to be a modulator of the activity of
HNF-4α, FKHR, the PEPCK promoter, PPARγ, C/EBPα, NRF-1, or nuclear hormone
receptors.

To determine whether a PGC-1α family member of the present invention
modulates myoglobin, troponin I slow, troponin I fast, MCAD, COX II, COX IV, and/or
cytochrome c expression, in vitro transcriptional assays can be performed. To perform
such an assay, the full length promoter/enhancer region of the gene of interest (e.g.,
myoglobin, troponin I slow, troponin I fast, MCAD, COX II, COX IV, and/or
cytochrome c) can be linked to a reporter gene such as chloramphenicol acetyltransferase
(CAT) or luciferase and introduced into host cells (e.g., liver cells such as Fao hepatoma
cells, or COS cells). The same host cells can then be transfected with a nucleic acid
molecule encoding the PGC-1α molecule. In some embodiments, nucleic acid
molecules encoding HNF-4α, FKHR, NRF-1, and/or PPARγ/RXRα can also be
transfected. The effect of the PGC-1α molecule can be measured by testing CAT or
luciferase activity and comparing it to CAT or luciferase activity in cells which do not
contain nucleic acid encoding the PGC-1α molecule. An increase or decrease in CAT or
luciferase activity indicates a modulation of expression of the gene of interest. Because
myoglobin, troponin I slow, MCAD, COX II, COX IV, and cytochrome c are known to
be markers of mitochondrial biogenesis and/or type I muscle formation, and troponin I
fast is known to be a marker of type II muscle, this assay can also measure the ability of
the PGC-1α molecule to modulate type I muscle formation.
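
The comparison described above can be sketched, purely for illustration, as a ratio of mean reporter activity in cells containing the PGC-1α-encoding nucleic acid to mean activity in cells lacking it. The function names and the use of a simple mean over replicate readings are assumptions made for this example; no normalization, statistics, or significance threshold is implied by the text.

```python
from statistics import mean


def fold_change(with_pgc1, without_pgc1):
    """Mean reporter activity with PGC-1 divided by mean activity without it."""
    baseline = mean(without_pgc1)
    if baseline == 0:
        raise ValueError("baseline reporter activity is zero")
    return mean(with_pgc1) / baseline


def direction(fc):
    """Describe the change in reporter activity relative to the baseline."""
    if fc > 1:
        return "increase"
    if fc < 1:
        return "decrease"
    return "no change"
```

Any readings passed to these functions would come from the assay itself; none are supplied here.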

The above described assay for testing the ability of a PGC-1α molecule to
modulate myoglobin, troponin I slow, troponin I fast, MCAD, COX II, COX IV, and
cytochrome c expression can also be used to test the ability of the PGC-1α molecule to
modulate type I muscle formation. If a PGC-1α molecule can modulate myoglobin,
troponin I slow, troponin I fast, MCAD, COX II, COX IV, and/or cytochrome c
expression, it can most likely modulate type I muscle formation. Alternatively, the
ability of a PGC-1α molecule to modulate type I muscle formation can be measured by
introducing a PGC-1α molecule into cells, e.g., muscle cells, and measuring the amount
of type I and type II muscle fibers that form.

In one embodiment, the biologically active portion of PGC-1α comprises at least
one domain or motif. Examples of such domains/motifs include a tyrosine
phosphorylation site, a cAMP phosphorylation site, a serine-arginine (SR) rich domain,
an RNA binding motif, and an LXXLL (SEQ ID NO:3) motif which mediates
interaction with HNF-4α and nuclear receptors. In one embodiment, the biologically
active portion of the protein which includes the domain or motif can modulate
differentiation of white adipocytes to brown adipocytes and/or thermogenesis in brown
adipocytes or can modulate gluconeogenesis. In a preferred embodiment, the
biologically active portion of the protein includes the domain or motif that can modulate
mitochondrial biogenesis and/or type I muscle formation. These domains are described
in detail herein. Additional nucleic acid fragments encoding biologically active portions
of PGC-1α can be prepared by isolating a portion of SEQ ID NO:1 or SEQ ID NO:4 or a
homologous nucleotide sequence, expressing the encoded portion of PGC-1α protein or
peptide (e.g., by recombinant expression in vitro) and assessing the activity of the
encoded portion of PGC-1α protein or peptide.

The invention further encompasses nucleic acid molecules that differ from the
nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:4 (and portions thereof) due
to degeneracy of the genetic code and thus encode the same PGC-1α protein as that
encoded by the nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:4. In
another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide
sequence encoding a protein having an amino acid sequence shown in SEQ ID NO:2 or
SEQ ID NO:5 or a protein having an amino acid sequence which is at least about 50%,
preferably at least about 60%, more preferably at least about 70%, yet more preferably at
least about 80%, still more preferably at least about 90%, and most preferably at least
about 95% or more homologous to the amino acid sequence of SEQ ID NO:2 or SEQ ID
NO:5.
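
Degeneracy of the genetic code, as referred to above, can be illustrated with a short Python sketch that translates two different nucleotide sequences into the same amino acid sequence. The table is the standard genetic code; the function name `translate` and the toy sequences are assumptions made for this example and are not taken from SEQ ID NO:1 or SEQ ID NO:4.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Two hypothetical sequences that differ at three positions yet encode the same peptide.
variant_a = "ATGCTGAAATAA"
variant_b = "ATGTTAAAGTGA"
```

Here `translate(variant_a)` and `translate(variant_b)` both yield the peptide Met-Leu-Lys, although the two nucleotide sequences differ.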

In addition to the mouse and human PGC-1α nucleotide sequences shown in
SEQ ID NO:1 and SEQ ID NO:4, it will be appreciated by those skilled in the art that
DNA sequence polymorphisms that lead to changes in the amino acid sequences of
PGC-1α may exist within a population (e.g., a mammalian population, e.g., a human
population). Such genetic polymorphism in the PGC-1α gene may exist among
individuals within a population due to natural allelic variation. As used herein, the terms
"gene" and "recombinant gene" refer to nucleic acid molecules comprising an open
reading frame encoding a PGC-1α protein, preferably a mammalian, e.g., human,
PGC-1α protein. Such natural allelic variations can typically result in 1-5% variance in the
nucleotide sequence of the PGC-1α gene. Any and all such nucleotide variations and
resulting amino acid polymorphisms in PGC-1α that are the result of natural allelic
variation and that do not alter the functional activity of PGC-1α are intended to be within
the scope of the invention. Moreover, nucleic acid molecules encoding PGC-1α proteins
from other species, and thus which have a nucleotide sequence which differs from the
human or mouse sequences of SEQ ID NO:1 and SEQ ID NO:4, are intended to be
within the scope of the invention. Nucleic acid molecules corresponding to natural
allelic variants and homologues of the mouse or human PGC-1α cDNAs of the invention
can be isolated based on their homology to the mouse or human PGC-1α nucleic acid
sequences disclosed herein using the mouse or human cDNA, or a portion thereof, as a
hybridization probe according to standard hybridization techniques under stringent
hybridization conditions (as described herein).

Moreover, nucleic acid molecules encoding other PGC-1α family members and
thus which have a nucleotide sequence which differs from the PGC-1α sequences of
SEQ ID NO:1 or SEQ ID NO:4 are intended to be within the scope of the invention. For
example, alternately-spliced isoforms of PGC-1α, referred to herein as PGC-1b and
PGC-1c, or a PGC-1α homologue referred to herein as PGC-1β, may be used in
the methods of the invention. The nucleotide and amino acid sequences of mouse
PGC-1b (SEQ ID NOs:6 and 7, respectively) are described in U.S. Provisional Patent
Application No. 60/303,468, incorporated herein by reference. The nucleotide and
amino acid sequences of mouse PGC-1c (SEQ ID NOs:8 and 9, respectively) are also
described in U.S. Provisional Patent Application No. 60/303,468. The nucleotide and
amino acid sequences of human (SEQ ID NOs:10 and 11, respectively) and mouse (SEQ
ID NOs:12 and 13, respectively) PGC-1β are described in U.S. Provisional Application
No. 60/338,126 and in Lin, J. et al. (2002) J. Biol. Chem. 277(3):1645-8, incorporated
herein by reference. The nucleotide and amino acid sequences of mouse PGC-1β are
also described in GenBank Accession Nos. AF453324 and AAL47054, respectively.

Additionally, other PGC-1α family members, for example a PGC-3 cDNA, can
be identified based on the nucleotide sequence of human PGC-1α or mouse PGC-1α. (It
should be noted that a gene called PPARγ coactivator 2, or PGC-2, has already been
described in the literature (Castillo, G. et al. (1999) EMBO J. 18(13):3676-87).
However, PGC-2 is both structurally and functionally unrelated to PGC-1α.) Moreover,
nucleic acid molecules encoding PGC-1α proteins from different species, and thus which
have a nucleotide sequence which differs from the PGC-1α sequences of SEQ ID NO:1
or SEQ ID NO:4, are intended to be within the scope of the invention. For example, rat
or monkey PGC-1α cDNA can be identified based on the nucleotide sequence of a
human PGC-1α.

Accordingly, in another embodiment, an isolated nucleic acid molecule of the
invention is at least 15 nucleotides in length and hybridizes under stringent conditions to
the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:1 or SEQ
ID NO:4 or a nucleotide sequence which is about 60%, preferably at least about 70%,
more preferably at least about 80%, still more preferably at least about 90%, and most
preferably at least about 95% or more homologous to the nucleotide sequence of SEQ ID
NO:1 or SEQ ID NO:4. In other embodiments, the nucleic acid is at least 30, 50, 100,
250 or 500 nucleotides in length. As used herein, the term "hybridizes under stringent
conditions" is intended to describe conditions for hybridization and washing under
which nucleotide sequences at least 60% homologous to each other typically remain
hybridized to each other. Preferably, the conditions are such that sequences at least
about 65%, more preferably at least about 70%, and even more preferably at least about
75% or more homologous to each other typically remain hybridized to each other. Such
stringent conditions are known to those skilled in the art and can be found in Current
Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. A
preferred, non-limiting example of stringent hybridization conditions is hybridization
in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more
washes in 0.2X SSC, 0.1% SDS at 50-65°C. Preferably, an isolated nucleic acid
molecule of the invention that hybridizes under stringent conditions to the sequence of
SEQ ID NO:1 or SEQ ID NO:4 corresponds to a naturally-occurring nucleic acid
molecule. As used herein, a "naturally-occurring" nucleic acid molecule refers to an
RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes
a natural protein). In one embodiment, the nucleic acid encodes a natural human
PGC-1α.

In addition to naturally-occurring allelic variants of the PGC-1α sequence that
may exist in the population, the skilled artisan will further appreciate that changes can be
introduced by mutation into the nucleotide sequence of SEQ ID NO:1 or SEQ ID NO:4,
thereby leading to changes in the amino acid sequence of the encoded PGC-1α protein,
without altering the functional ability of the PGC-1α protein. For example, nucleotide
substitutions leading to amino acid substitutions at "non-essential" amino acid residues
can be made in the sequence of SEQ ID NO:1 or SEQ ID NO:4. A "non-essential"
amino acid residue is a residue that can be altered from the wild-type sequence of
PGC-1α (e.g., the sequence of SEQ ID NO:2 or SEQ ID NO:5) without altering the activity of
PGC-1α, whereas an "essential" amino acid residue is required for PGC-1α activity. For
example, amino acid residues involved in the interaction of PGC-1α with binding partners
or target molecules (e.g., those present in an LXXLL motif) are most likely essential
residues of PGC-1α. Other amino acid residues, however, (e.g., those that are not
conserved or only semi-conserved between mouse and human) may not be essential for
activity and thus are likely to be amenable to alteration without altering PGC-1α activity.
Furthermore, amino acid residues that are essential for PGC-1α functions related to
thermogenesis, adipogenesis, or gluconeogenesis, but not essential for PGC-1α functions
related to type I muscle formation, are likely to be amenable to alteration.

Accordingly, another aspect of the invention pertains to nucleic acid molecules
encoding PGC-1α proteins that contain changes in amino acid residues that are not
essential for PGC-1α activity. Such PGC-1α proteins differ in amino acid sequence
from SEQ ID NO:2 or SEQ ID NO:5 yet retain at least one of the PGC-1α activities
described herein. In one embodiment, the isolated nucleic acid molecule comprises a
nucleotide sequence encoding a protein, wherein the protein comprises an amino acid
sequence at least about 60% homologous to the amino acid sequence of SEQ ID NO:2 or
SEQ ID NO:5 and is capable of modulating type I muscle formation. Preferably, the
protein encoded by the nucleic acid molecule is at least about 70% homologous,
preferably at least about 80-85% homologous, still more preferably at least about 90%,
and most preferably at least about 95% homologous to the amino acid sequence of SEQ
ID NO:2 or SEQ ID NO:5.

"Sequence identity or homology", as used herein, refers to the sequence 
similarity between two polypeptide molecules or between two nucleic acid molecules.
When a position in both of the two compared sequences is occupied by the same base or
amino acid monomer subunit, e.g., if a position in each of two DNA molecules is
occupied by adenine, then the molecules are homologous or sequence identical at that
position. The percent of homology or sequence identity between two sequences is a
function of the number of matching or homologous identical positions shared by the two
sequences divided by the number of positions compared x 100. For example, if 6 of 10
of the positions in two sequences are the same then the two sequences are 60%
homologous or have 60% sequence identity. By way of example, the DNA sequences
ATTGCC and TATGGC share 50% homology or sequence identity. Generally, a
comparison is made when two sequences are aligned to give maximum homology.
Unless otherwise specified, "loop out regions", e.g., those arising from deletions or
insertions in one of the sequences, are counted as mismatches.
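
The calculation defined above can be sketched in Python as follows. The function name `percent_identity` and the use of "-" as the gap character are assumptions made for this example; the two sequences are assumed to be already aligned and of equal length, and gap positions are counted as mismatches in keeping with the treatment of loop out regions.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of aligned positions occupied by the same base or amino acid."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != gap  # a gap never counts as a match
    )
    return 100.0 * matches / len(seq_a)
```

Applied to the example above, ATTGCC and TATGGC match at positions 3, 4 and 6, so 3 of 6 positions are identical, which gives 50%.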

The comparison of sequences and determination of percent homology between
two sequences can be accomplished using a mathematical algorithm. Preferably, the
alignment can be performed using the Clustal Method. Multiple alignment parameters
include Gap Penalty = 10, Gap Length Penalty = 10. For DNA alignments, the pairwise
alignment parameters can be Ktuple=2, Gap Penalty=5, Window=4, and Diagonals
Saved=4. For protein alignments, the pairwise alignment parameters can be Ktuple=1,
Gap Penalty=3, Window=5, and Diagonals Saved=5.

In a preferred embodiment, the percent identity between two amino acid
sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48):444-453
(1970)) algorithm which has been incorporated into the GAP program in the GCG
software package (available online), using either a Blosum 62 matrix or a PAM250
matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5,
or 6. In yet another preferred embodiment, the percent identity between two nucleotide
sequences is determined using the GAP program in the GCG software package
(available online), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70,
or 80 and a length weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the percent
identity between two amino acid or nucleotide sequences is determined using the
algorithm of E. Meyers and W. Miller (CABIOS, 4:11-17 (1989)) which has been
incorporated into the ALIGN program (version 2.0) (available online), using a PAM120
weight residue table, a gap length penalty of 12 and a gap penalty of 4.
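
For orientation, the global alignment idea underlying the Needleman and Wunsch algorithm can be sketched in Python as a dynamic-programming score calculation. This is a simplified illustration and not the GAP or ALIGN program: it assumes a flat match/mismatch score in place of a Blosum 62, PAM250 or PAM120 matrix and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are assumptions made for this example.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Best global alignment score of a and b under a linear gap penalty."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap      # a[i-1] aligned to a gap
            gap_in_a = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]
```

With these default scores, aligning ACGT against AGT gives three matches and one gap, for a score of 2.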

An isolated nucleic acid molecule encoding a PGC-1α protein homologous to the
protein of SEQ ID NO:2 or SEQ ID NO:5 can be created by introducing one or more
nucleotide substitutions, additions or deletions into the nucleotide sequence of SEQ ID
NO:1 or SEQ ID NO:4 or a homologous nucleotide sequence such that one or more
amino acid substitutions, additions or deletions are introduced into the encoded protein.
Mutations can be introduced into SEQ ID NO:1 or SEQ ID NO:4 or the homologous
nucleotide sequence by standard techniques, such as site-directed mutagenesis and
PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at
one or more predicted non-essential amino acid residues. A "conservative amino acid
substitution" is one in which the amino acid residue is replaced with an amino acid
residue having a similar side chain. Families of amino acid residues having similar side
chains have been defined in the art. These families include amino acids with basic side
chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic
acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine,
tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine,
proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g.,
threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine,
tryptophan, histidine). Thus, a predicted nonessential amino acid residue in PGC-1α is
preferably replaced with another amino acid residue from the same side chain family.
Alternatively, in another embodiment, mutations can be introduced randomly along all
or part of a PGC-1α coding sequence, such as by saturation mutagenesis, and the
resultant mutants can be screened for a PGC-1α activity described herein to identify
mutants that retain PGC-1α activity. Following mutagenesis of SEQ ID NO:1 or SEQ
ID NO:4, the encoded protein can be expressed recombinantly (as described herein) and
the activity of the protein can be determined using, for example, assays described herein.
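
The side chain families listed above can be encoded, for illustration, as a lookup that reports whether a substitution stays within a family. The one-letter amino acid codes, the dictionary layout and the function names are assumptions made for this example; the family memberships are those recited in the text.

```python
# Side chain families as listed above, using one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": {"K", "R", "H"},
    "acidic": {"D", "E"},
    "uncharged polar": {"G", "N", "Q", "S", "T", "Y", "C"},
    "nonpolar": {"A", "V", "L", "I", "P", "F", "M", "W"},
    "beta-branched": {"T", "V", "I"},
    "aromatic": {"Y", "F", "W", "H"},
}


def shared_families(original: str, replacement: str) -> list:
    """Names of the families that contain both residues, in sorted order."""
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    )


def is_conservative(original: str, replacement: str) -> bool:
    """True when a substitution replaces a residue with a different one from the same family."""
    return original != replacement and bool(shared_families(original, replacement))
```

For instance, lysine to arginine stays within the basic family, whereas lysine to aspartic acid crosses from the basic to the acidic family.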
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In addition to the nucleic acid molecules encoding PGC-la proteins described 
above, another aspect of the invention pertains to isolated nucleic acid molecules which 
are antisense thereto. An "antisense" nucleic acid comprises a nucleotide sequence 
which is complementary to a "sense" nucleic acid encoding a protein, ie. 9 
5 complementary to the coding strand of a double-stranded cDNA molecule or 

complementary to an mRNA sequence. Accordingly, an antisense nucleic acid can 
hydrogen bond to a sense nucleic acid. The antisense nucleic acid can be 
complementary to an entire PGC-la coding strand, or to only a portion thereof In one 
embodiment, an antisense nucleic acid molecule is antisense to a "coding region" of the 
10 coding strand of a nucleotide sequence encoding PGC-la. The term "coding region" 
refers to the region of the nucleotide sequence comprising codons which are translated 
into amino acid residues (ie., the entire coding region of SEQ ID NO:4 comprises 
nucleotides 92 to 2482, the entire coding region of SEQ ID NO:l comprises nucleotides 
89 to 2482). In another embodiment, the antisense nucleic acid molecule is antisense to 
15 a "noncoding region" of the coding strand of a nucleotide sequence encoding PGC-1 a. 
The term "noncoding region" refers to 5' and 3' sequences which flank the coding 
region that are not translated into amino acids (ie., also referred to as 5' and 3' 
untranslated regions). 
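
The distinction between coding and noncoding regions can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the helper names `split_transcript` and `coding_length` are hypothetical, the coordinates are treated as 1-based and inclusive, and the toy sequence is invented for the example and is not a PGC-1α sequence. Only the nucleotide coordinates are taken from the text above.

```python
def split_transcript(cdna, cds_start, cds_end):
    """Split a cDNA into (5' noncoding, coding, 3' noncoding) regions.

    cds_start and cds_end are assumed to be 1-based, inclusive positions.
    """
    if not (1 <= cds_start <= cds_end <= len(cdna)):
        raise ValueError("coding region coordinates fall outside the sequence")
    five_prime = cdna[:cds_start - 1]
    coding = cdna[cds_start - 1:cds_end]
    three_prime = cdna[cds_end:]
    return five_prime, coding, three_prime


def coding_length(cds_start, cds_end):
    """Number of nucleotides in a region given 1-based inclusive coordinates."""
    return cds_end - cds_start + 1


# Toy sequence (hypothetical): 2 nt 5' flank, a 9 nt coding region, 2 nt 3' flank.
toy = "GGATGAAATAGCC"
print(split_transcript(toy, 3, 11))

# Coordinates quoted in the text for the two disclosed sequences.
for name, (start, end) in {"SEQ ID NO:4": (92, 2482), "SEQ ID NO:1": (89, 2482)}.items():
    n = coding_length(start, end)
    print(name, n, "nucleotides,", n // 3, "codons, remainder", n % 3)
```

Both quoted coordinate ranges span a whole number of codons, which is consistent with their description as complete coding regions.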

Given the coding strand sequences encoding PGC-1α disclosed herein (e.g., SEQ
ID NO:1 and SEQ ID NO:4), antisense nucleic acids of the invention can be designed
according to the rules of Watson and Crick base pairing. The antisense nucleic acid
molecule can be complementary to the entire coding region of PGC-1α mRNA, but more
preferably is an oligonucleotide which is antisense to only a portion of the coding or
noncoding region of PGC-1α mRNA. For example, the antisense oligonucleotide can be
complementary to the region surrounding the translation start site of PGC-1α mRNA.
An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45
or 50 nucleotides in length. An antisense nucleic acid of the invention can be
constructed using chemical synthesis and enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or
variously modified nucleotides designed to increase the biological stability of the
molecules or to increase the physical stability of the duplex formed between the
antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine
substituted nucleotides can be used. Examples of modified nucleotides which can be
used to generate the antisense nucleic acid include 5-fluorouracil, 5-bromouracil,
5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine,
5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine,
N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic
acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced
biologically using an expression vector into which a nucleic acid has been subcloned in
an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of
an antisense orientation to a target nucleic acid of interest, described further in the
following subsection).
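
The Watson and Crick design rule described above can be sketched in Python as a reverse-complement operation applied to a window around the translation start site. This is a minimal illustration, not a validated oligonucleotide design tool: the function names, the window parameters, and the toy coding-strand sequence are all assumptions made for the example, and the toy sequence is not a PGC-1α sequence.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the strand that base-pairs with `dna`, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start_codon_index, length=20, upstream=10):
    """Antisense DNA oligo covering the region around the translation start.

    start_codon_index is the 0-based position of the A of the ATG start codon.
    The window begins `upstream` nucleotides 5' of the ATG and spans `length`
    nucleotides (both values are illustrative defaults).
    """
    begin = max(0, start_codon_index - upstream)
    target = coding_strand[begin:begin + length]
    return reverse_complement(target)


# Hypothetical coding-strand fragment used only for demonstration.
toy_coding = "GCCGCCACCATGGCGTGGGACATG"
atg = toy_coding.find("ATG")
oligo = antisense_oligo(toy_coding, atg, length=12, upstream=5)
print(oligo)
# The oligo's own reverse complement must occur in the coding strand.
print(reverse_complement(oligo) in toy_coding)
```

A length of about 5 to 50 nucleotides, as stated in the text, would be passed through the `length` parameter.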

The antisense nucleic acid molecules of the invention are typically administered
to a subject or generated in situ such that they hybridize with or bind to cellular mRNA
and/or genomic DNA encoding a PGC-1α protein to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the
case of an antisense nucleic acid molecule which binds to DNA duplexes, through
specific interactions in the major groove of the double helix. An example of a route of
administration of an antisense nucleic acid molecule of the invention includes direct
injection at a tissue site. Alternatively, an antisense nucleic acid molecule can be
modified to target selected cells and then administered systemically. For example, for
systemic administration, an antisense molecule can be modified such that it specifically
binds to a receptor or an antigen expressed on a selected cell surface, e.g., by linking the
antisense nucleic acid molecule to a peptide or an antibody which binds to a cell surface
receptor or antigen. The antisense nucleic acid molecule can also be delivered to cells
using the vectors described herein. To achieve sufficient intracellular concentrations of
the antisense molecules, vector constructs in which the antisense nucleic acid molecule
is placed under the control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention
is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific double-stranded hybrids with complementary RNA in which, contrary to the
usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids
Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a
2'-o-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity which are
capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have
a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in
Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave
PGC-1α mRNA transcripts to thereby inhibit translation of PGC-1α mRNA. A
ribozyme having specificity for a PGC-1α-encoding nucleic acid can be designed based
upon the nucleotide sequence of a PGC-1α cDNA disclosed herein (e.g., SEQ ID NO:1
or SEQ ID NO:4). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a PGC-1α-encoding mRNA. See, e.g., Cech et al.
U.S. Patent No. 4,987,071 and Cech et al. U.S. Patent No. 5,116,742. Alternatively,
PGC-1α mRNA can be used to select a catalytic RNA having a specific ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J.W. (1993)
Science 261:1411-1418.

Alternatively, PGC-1α gene expression can be inhibited by targeting nucleotide
sequences complementary to the regulatory region of the PGC-1α gene (e.g., the PGC-1α
promoter and/or enhancers) to form triple helical structures that prevent transcription of
the PGC-1α gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des.
6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L.J.
(1992) Bioessays 14(12):807-15.

II. Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to the use of vectors, preferably
expression vectors, containing a nucleic acid encoding PGC-1α (or a portion thereof).
As used herein, the term "vector" refers to a nucleic acid molecule capable of
transporting another nucleic acid to which it has been linked. One type of vector is a
"plasmid", which refers to a circular double stranded DNA loop into which additional
DNA segments can be ligated. Another type of vector is a viral vector, wherein
additional DNA segments can be ligated into the viral genome. Certain vectors are
capable of autonomous replication in a host cell into which they are introduced (e.g.,
bacterial vectors having a bacterial origin of replication and episomal mammalian
vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the
genome of a host cell upon introduction into the host cell, and thereby are replicated
along with the host genome. Moreover, certain vectors are capable of directing the
expression of genes to which they are operatively linked. Such vectors are referred to
herein as "expression vectors". In general, expression vectors of utility in recombinant
DNA techniques are often in the form of plasmids. In the present specification,
"plasmid" and "vector" can be used interchangeably as the plasmid is the most
commonly used form of vector. However, the invention is intended to include such
other forms of expression vectors, such as viral vectors (e.g., replication defective
retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent
functions.

The recombinant expression vectors of the invention comprise a nucleic acid of
the invention in a form suitable for expression of the nucleic acid in a host cell, which
means that the recombinant expression vectors include one or more regulatory
sequences, selected on the basis of the host cells to be used for expression, which is
operatively linked to the nucleic acid sequence to be expressed. Within a recombinant
expression vector, "operably linked" is intended to mean that the nucleotide sequence of
interest is linked to the regulatory sequence(s) in a manner which allows for expression
of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host
cell when the vector is introduced into the host cell). The term "regulatory sequence" is
intended to include promoters, enhancers and other expression control elements (e.g.,
polyadenylation signals). Such regulatory sequences are described, for example, in
Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press,
San Diego, CA (1990). Regulatory sequences include those which direct constitutive
expression of a nucleotide sequence in many types of host cell and those which direct
expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific
regulatory sequences). In a preferred embodiment, a muscle specific promoter is used to
direct expression of the nucleotide sequence in muscle (e.g., in a type I muscle cell or in
a type II muscle cell). Muscle specific promoters include, without limitation, the muscle
creatine kinase promoter, the dystrophin promoter, the myostatin promoter, the GDF-8
promoter, the UCP-3 promoter, the MyoD promoter, the MEF2 promoter, the myosin
heavy chain promoter, the myosin light chain promoter, and a troponin promoter. It will
be appreciated by those skilled in the art that the design of the expression vector can
depend on such factors as the choice of the host cell to be transformed, the level of
expression of protein desired, etc. The expression vectors of the invention can be
introduced into host cells to thereby produce proteins or peptides, including fusion
proteins or peptides, encoded by nucleic acids as described herein (e.g., PGC-1α proteins,
mutant forms of PGC-1α, fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for
expression of PGC-1α in prokaryotic or eukaryotic cells. For example, PGC-1α can be
expressed in bacterial cells such as E. coli, insect cells (using baculovirus expression
vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in
Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press,
San Diego, CA (1990). Alternatively, the recombinant expression vector can be
transcribed and translated in vitro, for example using T7 promoter regulatory sequences
and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out in E. coli with
vectors containing constitutive or inducible promoters directing the expression of either
fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein
encoded therein, usually to the amino terminus of the recombinant protein. Such fusion
vectors typically serve three purposes: 1) to increase expression of recombinant protein;
2) to increase the solubility of the recombinant protein; and 3) to aid in the purification
of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion
expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion
moiety and the recombinant protein to enable separation of the recombinant protein from
the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and
their cognate recognition sequences, include Factor Xa, thrombin and enterokinase.
Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D.B.
and Johnson, K.S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, MA)
and pRIT5 (Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase (GST),
maltose E binding protein, or protein A, respectively, to the target recombinant protein.
In one embodiment, the coding sequence of the PGC-1α is cloned into a pGEX
expression vector to create a vector encoding a fusion protein comprising, from the
N-terminus to the C-terminus, GST-thrombin cleavage site-PGC-1α. The fusion protein
can be purified by affinity chromatography using glutathione-agarose resin.
Recombinant PGC-1α unfused to GST can be recovered by cleavage of the fusion
protein with thrombin.
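
The layout of such a fusion protein, and the effect of thrombin cleavage on it, can be sketched in Python. The sketch rests on several assumptions that are not stated in the text: the cleavage site is taken to be the Leu-Val-Pro-Arg-Gly-Ser motif commonly used in pGEX-type vectors, with cleavage between Arg and Gly; the tag and target sequences are short hypothetical placeholders rather than real GST or PGC-1α sequences; and the function name is invented for the example.

```python
# Assumed thrombin recognition motif (one-letter codes); cleavage is taken
# to occur between R and G, leaving G-S on the released target protein.
THROMBIN_SITE = "LVPRGS"
CUT_OFFSET = 4  # number of site residues that stay with the N-terminal tag


def cleave_at_thrombin(fusion):
    """Return (tag_fragment, target_fragment) after cleavage at the first site."""
    idx = fusion.find(THROMBIN_SITE)
    if idx < 0:
        raise ValueError("no thrombin site found in fusion protein")
    cut = idx + CUT_OFFSET
    return fusion[:cut], fusion[cut:]


# Hypothetical placeholders: N-terminal tag stub, site, then target stub.
tag_stub = "MSPILG"
target_stub = "MAWDMC"
fusion = tag_stub + THROMBIN_SITE + target_stub

tag_fragment, target_fragment = cleave_at_thrombin(fusion)
print(tag_fragment)
print(target_fragment)
```

Under these assumptions the recovered target carries two extra N-terminal residues from the cleavage site, which is a typical consequence of this kind of fusion design.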

Examples of suitable inducible non-fusion E. coli expression vectors include
pTrc (Amann et al. (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene
Expression Technology: Methods in Enzymology 185, Academic Press, San Diego,
California (1990) 60-89). Target gene expression from the pTrc vector relies on host
RNA polymerase transcription from a hybrid trp-lac fusion promoter. Target gene
expression from the pET 11d vector relies on transcription from a T7 gn10-lac fusion
promoter mediated by a coexpressed viral RNA polymerase (T7 gn1). This viral
polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident λ
prophage harboring a T7 gn1 gene under the transcriptional control of the lacUV 5
promoter.

One strategy to maximize recombinant protein expression in E. coli is to express
the protein in a host bacteria with an impaired capacity to proteolytically cleave the
recombinant protein (Gottesman, S., Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, California (1990) 119-128). Another
strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an
expression vector so that the individual codons for each amino acid are those
preferentially utilized in E. coli (Wada et al. (1992) Nucleic Acids Res. 20:2111-2118).
Such alteration of nucleic acid sequences of the invention can be carried out by standard
DNA synthesis techniques.
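
The codon-alteration strategy described above can be sketched in Python as a synonymous recoding step. This is a simplified illustration, not the procedure of the cited reference: the function names are hypothetical, the toy coding sequence is invented, and the small preferred-codon table is an illustrative subset chosen for the example. A real table would be taken from published codon-usage data for the intended host.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative subset of host-preferred codons (an assumption for the sketch).
ILLUSTRATIVE_PREFERRED = {"L": "CTG", "K": "AAA", "E": "GAA"}


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of three")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        out.append(preferred.get(amino_acid, codon))
    return "".join(out)


# Hypothetical toy coding sequence, not a PGC-1α sequence.
toy_cds = "ATGTTAAAGGAGTGGTAA"
optimized = recode(toy_cds, ILLUSTRATIVE_PREFERRED)
print(optimized)
# The encoded protein must be unchanged by the recoding.
print(translate(toy_cds) == translate(optimized))
```

Checking that the translation is unchanged is the key safeguard here, since the approach alters only the nucleic acid sequence and not the encoded protein.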

In another embodiment, the PGC-1α expression vector is a yeast expression
vector. Examples of vectors for expression in yeast S. cerevisiae include pYepSec1
(Baldari, et al., (1987) EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz, (1982) Cell
30:933-943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), and pYES2 (Invitrogen
Corporation, San Diego, CA).

Alternatively, PGC-1α can be expressed in insect cells using baculovirus
expression vectors. Baculovirus vectors available for expression of proteins in cultured
insect cells (e.g., Sf9 cells) include the pAc series (Smith et al. (1983) Mol. Cell Biol.
3:2156-2165) and the pVL series (Luckow and Summers (1989) Virology 170:31-39).

In yet another embodiment, a nucleic acid of the invention is expressed in
mammalian cells using a mammalian expression vector. Examples of mammalian
expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC
(Kaufman et al. (1987) EMBO J. 6:187-195). When used in mammalian cells, the
expression vector's control functions are often provided by viral regulatory elements.
For example, commonly used promoters are derived from polyoma, Adenovirus 2,
cytomegalovirus and Simian Virus 40. For other suitable expression systems for both
prokaryotic and eukaryotic cells see chapters 16 and 17 of Sambrook, J., Fritsch, E. F.,
and Maniatis, T. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY,
1989.

In another embodiment, the recombinant mammalian expression vector is
capable of directing expression of the nucleic acid preferentially in a particular cell type
(e.g., tissue-specific regulatory elements are used to express the nucleic acid).
Tissue-specific regulatory elements are known in the art. Non-limiting examples of suitable
tissue-specific promoters include the muscle specific casein kinase promoter, the
albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277),
lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in
particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J.
8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and
Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament
promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477),
pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary
gland-specific promoters (e.g., milk whey promoter; U.S. Patent No. 4,873,316 and
European Application Publication No. 264,166). Developmentally-regulated promoters
are also encompassed, for example the murine hox promoters (Kessel and Gruss (1990)
Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989)
Genes Dev. 3:537-546).

The invention further provides a recombinant expression vector comprising a
DNA molecule of the invention cloned into the expression vector in an antisense
orientation. That is, the DNA molecule is operatively linked to a regulatory sequence in
a manner which allows for expression (by transcription of the DNA molecule) of an
RNA molecule which is antisense to PGC-1α mRNA. Regulatory sequences operatively
linked to a nucleic acid cloned in the antisense orientation can be chosen which direct
the continuous expression of the antisense RNA molecule in a variety of cell types, for
instance viral promoters and/or enhancers, or regulatory sequences can be chosen which
direct constitutive, tissue specific or cell type specific expression of antisense RNA. The
antisense expression vector can be in the form of a recombinant plasmid, phagemid or
attenuated virus in which antisense nucleic acids are produced under the control of a
high efficiency regulatory region, the activity of which can be determined by the cell
type into which the vector is introduced. For a discussion of the regulation of gene
expression using antisense genes see Weintraub, H. et al., Antisense RNA as a molecular
tool for genetic analysis, Reviews - Trends in Genetics, Vol. 1(1) 1986.

Another aspect of the invention pertains to host cells into which a recombinant
expression vector of the invention has been introduced. The terms "host cell" and
"recombinant host cell" are used interchangeably herein. It is understood that such
terms refer not only to the particular subject cell but to the progeny or potential progeny
of such a cell. Because certain modifications may occur in succeeding generations due
to either mutation or environmental influences, such progeny may not, in fact, be
identical to the parent cell, but are still included within the scope of the term as used
herein.

A host cell can be any prokaryotic or eukaryotic cell. For example, PGC-1α
protein can be expressed in bacterial cells such as E. coli, insect cells, yeast or
mammalian cells (such as muscle cells, Chinese hamster ovary cells (CHO) or COS
cells). Other suitable host cells are known to those skilled in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via
conventional transformation or transfection techniques. As used herein, the terms
"transformation" and "transfection" are intended to refer to a variety of art-recognized
techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including
calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated
transfection, lipofection, or electroporation. Suitable methods for transforming or
transfecting host cells can be found in Sambrook, et al. (Molecular Cloning: A
Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, 1989), and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the
expression vector and transfection technique used, only a small fraction of cells may
integrate the foreign DNA into their genome. In order to identify and select these
integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is
generally introduced into the host cells along with the gene of interest. Preferred
selectable markers include those which confer resistance to drugs, such as G418,
hygromycin and methotrexate. Nucleic acid encoding a selectable marker can be
introduced into a host cell on the same vector as that encoding PGC-1α or can be
introduced on a separate vector. Cells stably transfected with the introduced nucleic acid
can be identified by drug selection (e.g., cells that have incorporated the selectable
marker gene will survive, while the other cells die).

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in
culture, can be used to produce (i.e., express) PGC-1α protein. Accordingly, the
invention further provides methods for producing PGC-1α protein using the host cells of
the invention. In one embodiment, the method comprises culturing the host cell of the
invention (into which a recombinant expression vector encoding PGC-1α has been
introduced) in a suitable medium until PGC-1α is produced. In another embodiment, the
method further comprises isolating PGC-1α from the medium or the host cell.

III. Transgenic Animals

The host cells of the invention can also be used to produce nonhuman transgenic
animals. The nonhuman transgenic animals (e.g., mice, rats, monkeys, horses, dogs,
turkeys, fish, cows, pigs, sheep, goats, frogs, or chickens) can be used, for example, in
screening assays designed to identify agents or compounds, e.g., drugs, pharmaceuticals,
etc., which are involved with type I muscle formation and/or capable of ameliorating
detrimental symptoms of type I muscle associated disorders.

For example, in one embodiment, a host cell of the invention is a fertilized
oocyte or an embryonic stem cell into which PGC-1α-coding sequences have been
introduced. Such host cells can then be used to create non-human transgenic animals in
which exogenous PGC-1α sequences have been introduced into their genome or
homologous recombinant animals in which endogenous PGC-1α sequences have been
altered. Such animals are useful for studying the function and/or activity of PGC-1α and
for identifying and/or evaluating modulators of PGC-1α activity. As used herein, a
"transgenic animal" is a nonhuman animal, preferably a mammal, more preferably a
rodent such as a rat or mouse, in which one or more of the cells of the animal includes a
transgene. Other examples of transgenic animals include nonhuman primates, sheep,
dogs, cows, goats, chickens, amphibians, etc. A transgene is exogenous DNA which is
integrated into the genome of a cell from which a transgenic animal develops and which
remains in the genome of the mature animal, thereby directing the expression of an
encoded gene product in one or more cell types or tissues of the transgenic animal. As
used herein, a "homologous recombinant animal" is a nonhuman animal, preferably a
mammal, more preferably a mouse, in which an endogenous PGC-1α gene has been
altered by homologous recombination between the endogenous gene and an exogenous
DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal,
prior to development of the animal.

A transgenic animal of the invention can be created by introducing
PGC-1α-encoding nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by
microinjection or retroviral infection, and allowing the oocyte to develop in a
pseudopregnant female foster animal. The human PGC-1α cDNA sequence can be
introduced as a transgene into the genome of a nonhuman animal. Alternatively, a
nonhuman homologue of the human PGC-1α gene (SEQ ID NO:1), such as a mouse
PGC-1α gene (SEQ ID NO:4), can be used as a transgene. Intronic sequences and
polyadenylation signals can also be included in the transgene to increase the efficiency
of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably
linked to the PGC-1α transgene to direct expression of PGC-1α protein to particular
cells. Methods for generating transgenic animals via embryo manipulation and
microinjection, particularly animals such as mice, have become conventional in the art
and are described, for example, in U.S. Patent Nos. 4,736,866 and 4,870,009, both by
Leder et al., U.S. Patent No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating
the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.,
1986). Similar methods are used for production of other transgenic animals. A
transgenic founder animal can be identified based upon the presence of the PGC-1α
transgene in its genome and/or expression of PGC-1α mRNA in tissues or cells of the
animals. A transgenic founder animal can then be used to breed additional animals
carrying the transgene. Moreover, transgenic animals carrying a transgene encoding
PGC-1α can further be bred to other transgenic animals carrying other transgenes.

To create a homologous recombinant animal, a vector is prepared which contains
at least a portion of a PGC-1α gene into which a deletion, addition or substitution has
been introduced to thereby alter, e.g., functionally disrupt, the PGC-1α gene. The
PGC-1α gene can be a human gene (e.g., from a human genomic clone isolated from a human
genomic library screened with the cDNA of SEQ ID NO:1), but more preferably, is a
nonhuman homologue of a human PGC-1α gene. For example, a mouse PGC-1α gene
can be used to construct a homologous recombination vector suitable for altering an
endogenous PGC-1α gene in the mouse genome. In a preferred embodiment, the vector
is designed such that, upon homologous recombination, the endogenous PGC-1α gene is
functionally disrupted (i.e., no longer encodes a functional protein; also referred to as a
"knock out" vector). Alternatively, the vector can be designed such that, upon
homologous recombination, the endogenous PGC-1α gene is mutated or otherwise
altered but still encodes functional protein (e.g., the upstream regulatory region can be
altered to thereby alter the expression of the endogenous PGC-1α protein). In the
homologous recombination vector, the altered portion of the PGC-1α gene is flanked at
its 5' and 3' ends by additional nucleic acid of the PGC-1α gene to allow for
homologous recombination to occur between the exogenous PGC-1α gene carried by the
vector and an endogenous PGC-1α gene in an embryonic stem cell. The additional
flanking PGC-1α nucleic acid is of sufficient length for successful homologous
recombination with the endogenous gene. Typically, several kilobases of flanking DNA
(both at the 5' and 3' ends) are included in the vector (see e.g., Thomas, K.R. and
Capecchi, M. R. (1987) Cell 51:503 for a description of homologous recombination
vectors). The vector is introduced into an embryonic stem cell line (e.g., by
electroporation) and cells in which the introduced PGC-1α gene has homologously
recombined with the endogenous PGC-1α gene are selected (see e.g., Li, E. et al. (1992)
Cell 69:915). The selected cells are then injected into a blastocyst of an animal (e.g., a
mouse) to form aggregation chimeras (see e.g., Bradley, A. in Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, E.J. Robertson, ed. (IRL, Oxford, 1987)
pp. 113-152). A chimeric embryo can then be implanted into a suitable pseudopregnant
female foster animal and the embryo brought to term. Progeny harboring the
homologously recombined DNA in their germ cells can be used to breed animals in
which all cells of the animal contain the homologously recombined DNA by germline
transmission of the transgene. Methods for constructing homologous recombination
vectors and homologous recombinant animals are described further in Bradley, A.
(1991) Current Opinion in Biotechnology 2:823-829 and in PCT International
Publication Nos.: WO 90/11354 by Le Mouellec et al.; WO 91/01140 by Smithies et
al.; WO 92/0968 by Zijlstra et al.; and WO 93/04169 by Berns et al.

In one embodiment of the invention, transgenic animals are created using a
vector containing a muscle specific promoter operatively linked to a PGC-1α nucleic
acid molecule. Non-limiting examples of muscle specific promoters include muscle
creatine kinase, dystrophin, myostatin (Gonzalez-Cadavid, N.F. et al. (1998) Proc. Natl.
Acad. Sci. USA 95(25):14938-43), GDF-8 (PCT International Publication No. WO
00/04051), UCP-3, MyoD, MEF2, myosin heavy chain, myosin light chain, and various
forms of troponin.

In another preferred embodiment of the invention, transgenic mouse strains were
generated which express PGC-1α from the muscle creatine kinase promoter. The
PGC-1α cDNA sequence was placed under the control of a muscle-specific promoter (muscle
creatine kinase (MCK) promoter). Transgenic mice were generated using DNA
microinjection and screened by PCR. Four independent founder lines were obtained
(line #29, line #23, line #26, and line #31) and mated with wild type mice to obtain
progeny for use in experiments.

Lines #23 and #31 show strong PGC-1α mRNA expression, line #26 shows low
PGC-1α expression, while line #29 shows little PGC-1α expression. These mice show a
PGC-1α dose-dependent increase in type I specific marker gene expression in the
muscle, a dose-dependent decrease in type II specific marker gene expression in the
muscle, and an increase in type I muscle fiber content, as
determined by metachromatic and anti-myosin histological analysis. The transgenic
mice have a greatly increased amount of dark-colored (type I) muscle throughout their
entire bodies, including the hind-limb muscles. More specifically, the gastrocnemius
muscle (normally a type II muscle) is the same dark color in the transgenic mice as the
soleus (type I) muscle. The muscle fibers isolated from the transgenic mice also are
more resistant to exercise-induced fatigue, a hallmark for slow-twitch muscle fibers and
muscles following endurance training.

In another embodiment, transgenic nonhuman animals can be produced which
contain selected systems which allow for regulated expression of the transgene. One
example of such a system is the cre/loxP recombinase system of bacteriophage P1. For
a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) Proc.
Natl. Acad. Sci. USA 89:6232-6236. Another example of a recombinase system is the
FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science
251:1351-1355). If a cre/loxP recombinase system is used to regulate expression of the
transgene, animals containing transgenes encoding both the Cre recombinase and a
selected protein are required. Such animals can be provided through the construction of
"double" transgenic animals, e.g., by mating two transgenic animals, one containing a
transgene encoding a selected protein and the other containing a transgene encoding a
recombinase.

Clones of the nonhuman transgenic animals described herein can also be
produced according to the methods described in Wilmut, I. et al. (1997) Nature 385:810-
813 and PCT International Publication Nos. WO 97/07668 and WO 97/07669. In brief,
a cell, i.e., a somatic cell, from the transgenic animal can be isolated and induced to exit
the growth cycle and enter G0 phase. The quiescent cell can then be fused, i.e., through
the use of electrical pulses, to an enucleated oocyte from an animal of the same species
from which the quiescent cell is isolated. The reconstructed oocyte is then cultured such
that it develops to morula or blastocyst and then transferred to a pseudopregnant female
foster animal. The offspring born of this female foster animal will be a clone of the
animal from which the cell, i.e., the somatic cell, is isolated.

IV. Isolated PGC-1α Proteins and Anti-PGC-1α Antibodies

Another aspect of the invention pertains to the use of isolated PGC-1α proteins,
and biologically active portions thereof, as well as peptide fragments suitable for use as
immunogens to raise anti-PGC-1α antibodies. An "isolated" or "purified" protein or
biologically active portion thereof is substantially free of cellular material when
produced by recombinant DNA techniques, or chemical precursors or other chemicals
when chemically synthesized. The language "substantially free of cellular material"
includes preparations of PGC-1α protein in which the protein is separated from cellular
components of the cells in which it is naturally or recombinantly produced. In one
embodiment, the language "substantially free of cellular material" includes preparations
of PGC-1α protein having less than about 30% (by dry weight) of non-PGC-1α protein
(also referred to herein as a "contaminating protein"), more preferably less than about
20% of non-PGC-1α protein, still more preferably less than about 10% of non-PGC-1α
protein, and most preferably less than about 5% non-PGC-1α protein. When the
PGC-1α protein or biologically active portion thereof is recombinantly produced, it is also
preferably substantially free of culture medium, i.e., culture medium represents less than
about 20%, more preferably less than about 10%, and most preferably less than about
5% of the volume of the protein preparation. The language "substantially free of
chemical precursors or other chemicals" includes preparations of PGC-1α protein in
which the protein is separated from chemical precursors or other chemicals which are
involved in the synthesis of the protein. In one embodiment, the language "substantially
free of chemical precursors or other chemicals" includes preparations of PGC-1α protein
having less than about 30% (by dry weight) of chemical precursors or non-PGC-1α
chemicals, more preferably less than about 20% chemical precursors or non-PGC-1α
chemicals, still more preferably less than about 10% chemical precursors or
non-PGC-1α chemicals, and most preferably less than about 5% chemical precursors or
non-PGC-1α chemicals. In preferred embodiments, isolated proteins or biologically active
portions thereof lack contaminating proteins from the same animal from which the
PGC-1α protein is derived. Typically, such proteins are produced by recombinant expression
of, for example, a human PGC-1α protein in a nonhuman cell.

An isolated PGC-1α protein or a portion thereof of the invention has one or more
of the following biological activities: 1) modulation of the expression of myoglobin,
troponin I slow, troponin I fast, MCAD, COX II, COX IV, and/or cytochrome c; 2)
modulation of coactivation of MEF2 transcription factors; 3) modulation of type I muscle
formation; 4) modulation of the conversion of type II muscle fibers into type I muscle
fibers; 5) modulation of the response of muscle fibers to exercise induced fatigue; and/or
6) treatment of diseases or disorders characterized by aberrant PGC-1α expression or
activity, i.e., heart failure, disuse atrophy, mitochondrial myopathy, and/or systemic
metabolic disease. In preferred embodiments, the protein or portion thereof
comprises an amino acid sequence which is sufficiently homologous to an amino acid
sequence of SEQ ID NO:2 or SEQ ID NO:5 such that the protein or portion thereof
maintains the ability to modulate gluconeogenesis. The portion of the protein is
preferably a biologically active portion as described herein. In another preferred
embodiment, the PGC-1α protein (i.e., amino acid residues 1-797 or amino acid residues
1-798) has an amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5,
respectively, or an amino acid sequence which is at least about 50%, preferably at least
about 60%, more preferably at least about 70%, yet more preferably at least about 80%,
still more preferably at least about 90%, and most preferably at least about 95% or more
homologous to the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5. In
yet another preferred embodiment, the PGC-1α protein has an amino acid sequence
which is encoded by a nucleotide sequence which hybridizes, i.e., hybridizes under
stringent conditions, to the nucleotide sequence of SEQ ID NO:1 or SEQ ID NO:4 or a
nucleotide sequence which is at least about 50%, preferably at least about 60%, more
preferably at least about 70%, yet more preferably at least about 80%, still more
preferably at least about 90%, and most preferably at least about 95% or more
homologous to the nucleotide sequence shown in SEQ ID NO:1 or SEQ ID NO:4. The
preferred PGC-1α proteins of the present invention also preferably possess at least one of
the PGC-1α biological activities described herein. For example, a preferred PGC-1α
protein of the present invention includes an amino acid sequence encoded by a
nucleotide sequence which hybridizes, i.e., hybridizes under stringent conditions, to the
nucleotide sequence of SEQ ID NO:1 or SEQ ID NO:4 and which can modulate
gluconeogenesis.
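
The graded homology preferences recited above (at least about 50%, 60%, 70%, 80%, 90%, and 95%) can be illustrated with a short computational sketch. The sketch below is only an illustration under stated assumptions: it treats percent identity over pre-aligned sequences as a stand-in for "homologous", which may differ from the calculation defined elsewhere in this specification, and the function names and toy sequences are hypothetical rather than taken from SEQ ID NO:2 or SEQ ID NO:5.

```python
# Preference tiers (percent) as recited in the text, highest first.
TIERS = (95, 90, 80, 70, 60, 50)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared columns whose residues match.

    Assumption: both strings are already aligned to equal length and use
    '-' for gaps. Columns containing a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap column: not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


def highest_tier(pct: float):
    """Return the highest recited tier met by pct, or None if below 50%."""
    for tier in TIERS:
        if pct >= tier:
            return tier
    return None


# Toy, hypothetical aligned peptides (not real PGC-1α sequence).
pct = percent_identity("MAWD-", "MAWEM")
tier = highest_tier(pct)
```

Because the text qualifies every threshold with "about", a hard cutoff such as the one in `highest_tier` is a simplification.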

In other embodiments, the PGC-1α protein is substantially homologous to the
amino acid sequence of SEQ ID NO:2 or SEQ ID NO:5 and retains the functional
activity of the protein of SEQ ID NO:2 or SEQ ID NO:5 yet differs in amino acid
sequence due to natural allelic variation or mutagenesis, as described in detail in
subsection I above. Accordingly, in another embodiment, the PGC-1α protein is a
protein which comprises an amino acid sequence which is at least about 50%, preferably
at least about 60%, more preferably at least about 70%, yet more preferably at least
about 80%, still more preferably at least about 90%, and most preferably at least about
95% or more homologous to the amino acid sequence of SEQ ID NO:2 or SEQ ID NO:5.

Biologically active portions of the PGC-1α protein include peptides comprising
amino acid sequences derived from the amino acid sequence of the PGC-1α protein, i.e.,
the amino acid sequence shown in SEQ ID NO:2 or SEQ ID NO:5 or the amino acid
sequence of a protein homologous to the PGC-1α protein, which include fewer amino
acids than the full length PGC-1α protein or the full length protein which is homologous
to the PGC-1α protein, and exhibit at least one activity of the PGC-1α protein.
Typically, biologically active portions (peptides, i.e., peptides which are, for example, 5,
10, 15, 20, 30, 35, 36, 37, 38, 39, 40, 50, 100 or more amino acids in length) comprise a
domain or motif, i.e., a tyrosine phosphorylation site, a cAMP phosphorylation site, a
serine-arginine (SR) rich domain, and/or an RNA binding motif, with at least one
activity of the PGC-1α protein. In a preferred embodiment, the biologically active
portion of the protein which includes one or more of the domains/motifs described herein
can modulate type I muscle formation, mitochondrial biogenesis, as well as
differentiation of adipocytes and/or thermogenesis in brown adipocytes, and/or
gluconeogenesis. Moreover, other biologically active portions, in which other regions of
the protein are deleted, can be prepared by recombinant techniques and evaluated for one
or more of the activities described herein. Preferably, the biologically active portions of
the PGC-1α protein include one or more selected domains/motifs or portions thereof
having biological activity.

PGC-1α proteins are preferably produced by recombinant DNA techniques. For
example, a nucleic acid molecule encoding the protein is cloned into an expression
vector (as described above), the expression vector is introduced into a host cell (as
described above) and the PGC-1α protein is expressed in the host cell. The PGC-1α
protein can then be isolated from the cells by an appropriate purification scheme using
standard protein purification techniques. Alternative to recombinant expression, a
PGC-1α protein, polypeptide, or peptide can be synthesized chemically using standard peptide
synthesis techniques. Moreover, native PGC-1α protein can be isolated from cells (i.e.,
brown adipocytes), for example using an anti-PGC-1α antibody (described further
below).

The invention also provides PGC-1α chimeric or fusion proteins. As used
herein, a PGC-1α "chimeric protein" or "fusion protein" comprises a PGC-1α
polypeptide operatively linked to a non-PGC-1α polypeptide. A "PGC-1α polypeptide"
refers to a polypeptide having an amino acid sequence corresponding to PGC-1α,
whereas a "non-PGC-1α polypeptide" refers to a polypeptide having an amino acid
sequence corresponding to a protein which is not substantially homologous to the
PGC-1α protein, i.e., a protein which is different from the PGC-1α protein and which is
derived from the same or a different organism. Within the fusion protein, the term
"operatively linked" is intended to indicate that the PGC-1α polypeptide and the
non-PGC-1α polypeptide are fused in-frame to each other. The non-PGC-1α polypeptide
can be fused to the N-terminus or C-terminus of the PGC-1α polypeptide. For example,
in one embodiment the fusion protein is a GST-PGC-1α fusion protein in which the
PGC-1α sequences are fused to the C-terminus of the GST sequences. Such fusion
proteins can facilitate the purification of recombinant PGC-1α. In another embodiment,
the fusion protein is a PGC-1α protein containing a heterologous signal sequence at its
N-terminus. In certain host cells (i.e., mammalian host cells), expression and/or
secretion of PGC-1α can be increased through use of a heterologous signal sequence.

Preferably, a PGC-1α chimeric or fusion protein of the invention is produced by
standard recombinant DNA techniques. For example, DNA fragments coding for the
different polypeptide sequences are ligated together in-frame in accordance with
conventional techniques, for example by employing blunt-ended or stagger-ended
termini for ligation, restriction enzyme digestion to provide for appropriate termini,
filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid
undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor
primers which give rise to complementary overhangs between two consecutive gene
fragments which can subsequently be annealed and reamplified to generate a chimeric
gene sequence (see, for example, Current Protocols in Molecular Biology, eds. Ausubel
et al. John Wiley & Sons: 1992). Moreover, many expression vectors are commercially
available that already encode a fusion moiety (i.e., a GST polypeptide). A
PGC-1α-encoding nucleic acid can be cloned into such an expression vector such that the fusion
moiety is linked in-frame to the PGC-1α protein.

The present invention also pertains to homologues of the PGC-1α proteins which
function as either a PGC-1α agonist (mimetic) or a PGC-1α antagonist. In a preferred
embodiment, the PGC-1α agonists and antagonists stimulate or inhibit, respectively, a
subset of the biological activities of the naturally occurring form of the PGC-1α protein.
Thus, specific biological effects can be elicited by treatment with a homologue of
limited function. In one embodiment, treatment of a subject with a homologue having a
subset of the biological activities of the naturally occurring form of the protein has fewer
side effects in a subject relative to treatment with the naturally occurring form of the
PGC-1α protein.

Homologues of the PGC-1α protein can be generated by mutagenesis, i.e.,
discrete point mutation or truncation of the PGC-1α protein. As used herein, the term
"homologue" refers to a variant form of the PGC-1α protein which acts as an agonist or
antagonist of the activity of the PGC-1α protein. An agonist of the PGC-1α protein can
retain substantially the same, or a subset, of the biological activities of the PGC-1α
protein. An antagonist of the PGC-1α protein can inhibit one or more of the activities of
the naturally occurring form of the PGC-1α protein, by, for example, competitively
binding to a downstream or upstream member of the PGC-1α cascade which includes the
PGC-1α protein. Thus, the mammalian PGC-1α protein and homologues thereof of the
present invention can be, for example, either positive or negative regulators of adipocyte
differentiation and/or thermogenesis in brown adipocytes.

In an alternative embodiment, homologues of the PGC-1α protein can be
identified by screening combinatorial libraries of mutants, i.e., truncation mutants, of the
PGC-1α protein for PGC-1α protein agonist or antagonist activity. In one embodiment,
a variegated library of PGC-1α variants is generated by combinatorial mutagenesis at the
nucleic acid level and is encoded by a variegated gene library. A variegated library of
PGC-1α variants can be produced by, for example, enzymatically ligating a mixture of
synthetic oligonucleotides into gene sequences such that a degenerate set of potential
PGC-1α sequences is expressible as individual polypeptides, or alternatively, as a set of
larger fusion proteins (i.e., for phage display) containing the set of PGC-1α sequences
therein. There are a variety of methods which can be used to produce libraries of
potential PGC-1α homologues from a degenerate oligonucleotide sequence. Chemical
synthesis of a degenerate gene sequence can be performed in an automatic DNA
synthesizer, and the synthetic gene then ligated into an appropriate expression vector.
Use of a degenerate set of genes allows for the provision, in one mixture, of all of the
sequences encoding the desired set of potential PGC-1α sequences. Methods for
synthesizing degenerate oligonucleotides are known in the art (see, Narang, S.A.
(1983) Tetrahedron 39:3; Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et
al. (1984) Science 198:1056; Ike et al. (1983) Nucleic Acid Res. 11:477).
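
As a purely computational illustration of what a "degenerate set" of sequences means, the following sketch expands a degenerate oligonucleotide into every concrete sequence it encodes. It assumes the standard IUPAC nucleotide ambiguity codes, which the text does not itself recite, and the function name and example oligonucleotides are hypothetical; it does not model the chemical synthesis or ligation steps described above.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (assumed notation; the
# specification does not state which notation is used).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list:
    """Return every concrete sequence encoded by a degenerate oligo."""
    choices = [IUPAC[base] for base in oligo.upper()]
    # Cartesian product over the allowed bases at each position.
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical example: a codon with a fully degenerate third position.
variants = expand_degenerate("GCN")
```

The number of encoded sequences is the product of the choices at each position, which is why a single degenerate mixture can cover a large set of potential variants.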

In addition, libraries of fragments of the PGC-1α protein coding sequence can be
used to generate a variegated population of PGC-1α fragments for screening and subsequent
selection of homologues of a PGC-1α protein. In one embodiment, a library of coding
sequence fragments can be generated by treating a double stranded PCR fragment of a
PGC-1α coding sequence with a nuclease under conditions wherein nicking occurs only
about once per molecule, denaturing the double stranded DNA, renaturing the DNA to
form double stranded DNA which can include sense/antisense pairs from different
nicked products, removing single stranded portions from reformed duplexes by treatment
with S1 nuclease, and ligating the resulting fragment library into an expression vector.
By this method, an expression library can be derived which encodes N-terminal,
C-terminal and internal fragments of various sizes of the PGC-1α protein.
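
The three classes of fragment named above (N-terminal, C-terminal and internal) can be made concrete with a small in silico sketch. This is an illustration of the outcome only, not of the nuclease-based laboratory procedure; the function name, the `min_len` parameter, and the toy sequence are assumptions introduced for the example.

```python
def enumerate_fragments(protein: str, min_len: int):
    """Classify every contiguous sub-sequence of at least min_len residues.

    Returns three lists: fragments that start at the first residue
    (N-terminal), fragments that end at the last residue (C-terminal),
    and fragments that touch neither end (internal). The full-length
    sequence is excluded because it is not a fragment.
    """
    n = len(protein)
    n_terminal, c_terminal, internal = [], [], []
    for start in range(n):
        for end in range(start + min_len, n + 1):
            if start == 0 and end == n:
                continue  # skip the full-length sequence
            fragment = protein[start:end]
            if start == 0:
                n_terminal.append(fragment)
            elif end == n:
                c_terminal.append(fragment)
            else:
                internal.append(fragment)
    return n_terminal, c_terminal, internal


# Toy, hypothetical sequence (not real PGC-1α sequence).
n_frags, c_frags, mid_frags = enumerate_fragments("ABCD", 2)
```

For a real protein the number of possible fragments grows roughly with the square of its length, which is one reason library-based screening rather than one-by-one construction is described.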

Several techniques are known in the art for screening gene products of
combinatorial libraries made by point mutations or truncation, and for screening cDNA
libraries for gene products having a selected property. Such techniques are adaptable for
rapid screening of the gene libraries generated by the combinatorial mutagenesis of
PGC-1α homologues. The most widely used techniques, which are amenable to high
through-put analysis, for screening large gene libraries typically include cloning the gene
library into replicable expression vectors, transforming appropriate cells with the
resulting library of vectors, and expressing the combinatorial genes under conditions in
which detection of a desired activity facilitates isolation of the vector encoding the gene
whose product was detected. Recursive ensemble mutagenesis (REM), a new technique
which enhances the frequency of functional mutants in the libraries, can be used in
combination with the screening assays to identify PGC-1α homologues (Arkin and
Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993)
Protein Eng. 6(3):327-331).

An isolated PGC-1α protein, or a portion or fragment thereof, can be used as an
immunogen to generate antibodies that bind PGC-1α using standard techniques for
polyclonal and monoclonal antibody preparation. The full-length PGC-1α protein can be
used or, alternatively, the invention provides antigenic peptide fragments of PGC-1α for
use as immunogens. The antigenic peptide of PGC-1α comprises at least 8 amino acid
residues of the amino acid sequence shown in SEQ ID NO:2, SEQ ID NO:5 or a
homologous amino acid sequence as described herein and encompasses an epitope of
PGC-1α such that an antibody raised against the peptide forms a specific immune
complex with PGC-1α. Preferably, the antigenic peptide comprises at least 10 amino
acid residues, more preferably at least 15 amino acid residues, even more preferably at
least 20 amino acid residues, and most preferably at least 30 amino acid residues.
Preferred epitopes encompassed by the antigenic peptide are regions of PGC-1α that are
located on the surface of the protein, i.e., hydrophilic regions.
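
One way to shortlist candidate hydrophilic peptides of a given minimum length is a sliding-window hydropathy scan. The sketch below is a minimal illustration that assumes the Kyte-Doolittle hydropathy scale as a proxy for surface exposure; the text does not name any particular scale or prediction method, and the function name, window size default, and toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; lower = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilic_windows(sequence: str, window: int = 8, top: int = 3):
    """Return the `top` most hydrophilic windows as (start, peptide, score).

    `window` defaults to 8 residues, the minimum antigenic peptide length
    recited in the text. Scores are mean hydropathy over the window.
    """
    sequence = sequence.upper()
    scored = []
    for start in range(len(sequence) - window + 1):
        peptide = sequence[start:start + window]
        score = sum(KYTE_DOOLITTLE[aa] for aa in peptide) / window
        scored.append((score, start, peptide))
    scored.sort()  # ascending: most hydrophilic (lowest score) first
    return [(start, pep, round(score, 2)) for score, start, pep in scored[:top]]


# Toy, hypothetical sequence (not real PGC-1α sequence).
best = hydrophilic_windows("LLLLLLLLKKKKKKKK", window=8, top=1)
```

A hydropathy scan is only a rough heuristic; whether a region is actually surface-exposed and immunogenic would need to be confirmed experimentally.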

A PGC-1α immunogen typically is used to prepare antibodies by immunizing a
suitable subject (i.e., rabbit, goat, mouse or other mammal) with the immunogen. An
appropriate immunogenic preparation can contain, for example, recombinantly expressed
PGC-1α protein or a chemically synthesized PGC-1α peptide. The preparation can
further include an adjuvant, such as Freund's complete or incomplete adjuvant, or
similar immunostimulatory agent. Immunization of a suitable subject with an
immunogenic PGC-1α preparation induces a polyclonal anti-PGC-1α antibody
response.

Accordingly, another aspect of the invention pertains to anti-PGC-1α antibodies.
The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin molecules, i.e., molecules that
contain an antigen binding site which specifically binds (immunoreacts with) an antigen,
such as PGC-1α. Examples of immunologically active portions of immunoglobulin
molecules include F(ab) and F(ab')2 fragments which can be generated by treating the
antibody with an enzyme such as pepsin. The invention provides polyclonal and
monoclonal antibodies that bind PGC-1α. The term "monoclonal antibody" or
"monoclonal antibody composition", as used herein, refers to a population of antibody
molecules that contain only one species of an antigen binding site capable of
immunoreacting with a particular epitope of PGC-1α. A monoclonal antibody
composition thus typically displays a single binding affinity for a particular PGC-1α
protein with which it immunoreacts.

Polyclonal anti-PGC-1α antibodies can be prepared as described above by
immunizing a suitable subject with a PGC-1α immunogen. The anti-PGC-1α antibody
titer in the immunized subject can be monitored over time by standard techniques, such
as with an enzyme linked immunosorbent assay (ELISA) using immobilized PGC-1α. If
desired, the antibody molecules directed against PGC-1α can be isolated from the
mammal (i.e., from the blood) and further purified by well known techniques, such as
protein A chromatography to obtain the IgG fraction. At an appropriate time after
immunization, i.e., when the anti-PGC-1α antibody titers are highest, antibody-producing
cells can be obtained from the subject and used to prepare monoclonal
antibodies by standard techniques, such as the hybridoma technique originally described
by Kohler and Milstein (1975) Nature 256:495-497 (see also, Brown et al. (1981) J.
Immunol. 127:539-46; Brown et al. (1980) J. Biol. Chem. 255:4980-83; Yeh et al.
(1976) Proc. Natl. Acad. Sci. USA 76:2927-31; and Yeh et al. (1982) Int. J. Cancer
29:269-75), the more recent human B cell hybridoma technique (Kozbor et al. (1983)
Immunol. Today 4:72), the EBV-hybridoma technique (Cole et al. (1985), Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96) or trioma techniques.
The technology for producing monoclonal antibody hybridomas is well known (see
generally R. H. Kenneth, in Monoclonal Antibodies: A New Dimension In Biological
Analyses, Plenum Publishing Corp., New York, New York (1980); E. A. Lerner (1981)
Yale J. Biol. Med. 54:387-402; M. L. Gefter et al. (1977) Somatic Cell Genet.
3:231-36). Briefly, an immortal cell line (typically a myeloma) is fused to lymphocytes
(typically splenocytes) from a mammal immunized with a PGC-1α immunogen as
described above, and the culture supernatants of the resulting hybridoma cells are
screened to identify a hybridoma producing a monoclonal antibody that binds PGC-1α.

Any of the many well known protocols used for fusing lymphocytes and
immortalized cell lines can be applied for the purpose of generating an anti-PGC-1α
monoclonal antibody (see, i.e., G. Galfre et al. (1977) Nature 266:550-52; Gefter et al.
Somatic Cell Genet., cited supra; Lerner, Yale J. Biol. Med., cited supra; Kenneth,
Monoclonal Antibodies, cited supra). Moreover, the ordinarily skilled worker will
appreciate that there are many variations of such methods which also would be useful.
Typically, the immortal cell line (i.e., a myeloma cell line) is derived from the same
mammalian species as the lymphocytes. For example, murine hybridomas can be made
by fusing lymphocytes from a mouse immunized with an immunogenic preparation of
the present invention with an immortalized mouse cell line. Preferred immortal cell
lines are mouse myeloma cell lines that are sensitive to culture medium containing
hypoxanthine, aminopterin and thymidine ("HAT medium"). Any of a number of
myeloma cell lines can be used as a fusion partner according to standard techniques, i.e.,
the P3-NS1/1-Ag4-1, P3-x63-Ag8.653 or Sp2/O-Ag14 myeloma lines. These myeloma
lines are available from ATCC. Typically, HAT-sensitive mouse myeloma cells are
fused to mouse splenocytes using polyethylene glycol ("PEG"). Hybridoma cells
resulting from the fusion are then selected using HAT medium, which kills unfused and
unproductively fused myeloma cells (unfused splenocytes die after several days because
they are not transformed). Hybridoma cells producing a monoclonal antibody of the
invention are detected by screening the hybridoma culture supernatants for antibodies
that bind PGC-1α, i.e., using a standard ELISA assay.

Alternative to preparing monoclonal antibody-secreting hybridomas, a
monoclonal anti-PGC-1α antibody can be identified and isolated by screening a
recombinant combinatorial immunoglobulin library (i.e., an antibody phage display
library) with PGC-1α to thereby isolate immunoglobulin library members that bind
PGC-1α. Kits for generating and screening phage display libraries are commercially
available (i.e., the Pharmacia Recombinant Phage Antibody System, Catalog No. 27-
9400-01; and the Stratagene SurfZAP™ Phage Display Kit, Catalog No. 240612).
Additionally, examples of methods and reagents particularly amenable for use in
generating and screening antibody display libraries can be found in, for example, Ladner
et al. U.S. Patent No. 5,223,409; Kang et al. PCT International Publication No. WO
92/18619; Dower et al. PCT International Publication No. WO 91/17271; Winter et al.
PCT International Publication No. WO 92/20791; Markland et al. PCT International
Publication No. WO 92/15679; Breitling et al. PCT International Publication No. WO
93/01288; McCafferty et al. PCT International Publication No. WO 92/01047; Garrard
et al. PCT International Publication No. WO 92/09690; Ladner et al. PCT International
Publication No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1369-1372; Hay et
al. (1992) Hum. Antibod. Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-
1281; Griffiths et al. (1993) EMBO J. 12:725-734; Hawkins et al. (1992) J. Mol. Biol.
226:889-896; Clackson et al. (1991) Nature 352:624-628; Gram et al. (1992) Proc. Natl.
Acad. Sci. USA 89:3576-3580; Garrard et al. (1991) Bio/Technology 9:1373-1377;
Hoogenboom et al. (1991) Nucleic Acids Res. 19:4133-4137; Barbas et al. (1991) Proc.
Natl. Acad. Sci. USA 88:7978-7982; and McCafferty et al. (1990) Nature 348:552-554.

Additionally, recombinant anti-PGC-1α antibodies, such as chimeric and
humanized monoclonal antibodies, comprising both human and non-human portions,
which can be made using standard recombinant DNA techniques, are within the scope of
the invention. Such chimeric and humanized monoclonal antibodies can be produced by
recombinant DNA techniques known in the art, for example using methods described in
Robinson et al. International Application No. PCT/US86/02269; Akira, et al. European
Patent Application 184,187; Taniguchi, M., European Patent Application 171,496;
Morrison et al. European Patent Application 173,494; Neuberger et al. PCT
International Publication No. WO 86/01533; Cabilly et al. U.S. Patent No. 4,816,567;
Cabilly et al. European Patent Application 125,023; Better et al. (1988) Science
240:1041-1043; Liu et al. (1987) Proc. Natl. Acad. Sci. USA 84:3439-3443; Liu et al.
(1987) J. Immunol. 139:3521-3526; Sun et al. (1987) Proc. Natl. Acad. Sci. USA 84:214-
218; Nishimura et al. (1987) Canc. Res. 47:999-1005; Wood et al. (1985) Nature
314:446-449; Shaw et al. (1988) J. Natl. Cancer Inst. 80:1553-1559; Morrison, S.
L. (1985) Science 229:1202-1207; Oi et al. (1986) BioTechniques 4:214; Winter U.S.
Patent 5,225,539; Jones et al. (1986) Nature 321:522-525; Verhoeyen et al. (1988)
Science 239:1534; and Beidler et al. (1988) J. Immunol. 141:4053-4060.

An anti-PGC-1α antibody (i.e., monoclonal antibody) can be used to isolate
PGC-1α by standard techniques, such as affinity chromatography or
immunoprecipitation. An anti-PGC-1α antibody can facilitate the purification of natural
PGC-1α from cells and of recombinantly produced PGC-1α expressed in host cells.
Moreover, an anti-PGC-1α antibody can be used to detect PGC-1α protein (i.e., in a
cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of
expression of the PGC-1α protein. Anti-PGC-1α antibodies can be used diagnostically
to monitor protein levels in tissue as part of a clinical testing procedure, i.e., to, for
example, determine the efficacy of a given treatment regimen. Detection can be
facilitated by coupling (i.e., physically linking) the antibody to a detectable substance.
Examples of detectable substances include various enzymes, prosthetic groups,
fluorescent materials, luminescent materials, bioluminescent materials, and radioactive
materials. Examples of suitable enzymes include horseradish peroxidase, alkaline
phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic
group complexes include streptavidin/biotin and avidin/biotin; examples of suitable
fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate,
rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an
example of a luminescent material includes luminol; examples of bioluminescent
materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive
material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

V. Pharmaceutical Compositions 

The PGC-1α nucleic acid molecules, PGC-1α proteins, PGC-1α modulators, and
anti-PGC-1α antibodies (also referred to herein as "active compounds") of the invention
can be incorporated into pharmaceutical compositions suitable for administration to a
subject, i.e., a human. Such compositions typically comprise the nucleic acid molecule,
protein, modulator, or antibody and a pharmaceutically acceptable carrier. As used
herein the language "pharmaceutically acceptable carrier" is intended to include any and
all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and
absorption delaying agents, and the like, compatible with pharmaceutical administration.
The use of such media and agents for pharmaceutically active substances is well known
in the art. Except insofar as any conventional media or agent is incompatible with the
active compound, such media can be used in the compositions of the invention.
Supplementary active compounds can also be incorporated into the compositions.

A pharmaceutical composition of the invention is formulated to be compatible
with its intended route of administration. Examples of routes of administration include
parenteral, i.e., intravenous, intradermal, subcutaneous, oral (i.e., inhalation),
transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions
used for parenteral, intradermal, or subcutaneous application can include the following
components: a sterile diluent such as water for injection, saline solution, fixed oils,
polyethylene glycols, glycerine, propylene glycol or other synthetic solvents;
antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as
ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic
acid; buffers such as acetates, citrates or phosphates; and agents for the adjustment of
tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases,
such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be
enclosed in ampoules, disposable syringes or multiple dose vials made of glass or
plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous
solutions (where water soluble) or dispersions and sterile powders for the
extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous
administration, suitable carriers include physiological saline, bacteriostatic water,
Cremophor EL™ (BASF, Parsippany, NJ) or phosphate buffered saline (PBS). In all
cases, the composition must be sterile and should be fluid to the extent that easy
syringeability exists. It must be stable under the conditions of manufacture and storage
and must be preserved against the contaminating action of microorganisms such as
bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for
example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid
polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity
can be maintained, for example, by the use of a coating such as lecithin, by the
maintenance of the required particle size in the case of dispersion and by the use of
surfactants. Prevention of the action of microorganisms can be achieved by various
antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol,
ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include
isotonic agents, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium
chloride in the composition. Prolonged absorption of the injectable compositions can be
brought about by including in the composition an agent which delays absorption, for
example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound
(e.g., a PGC-1α protein or anti-PGC-1α antibody) in the required amount in an
appropriate solvent with one or a combination of ingredients enumerated above, as
required, followed by filtered sterilization. Generally, dispersions are prepared by
incorporating the active compound into a sterile vehicle which contains a basic
dispersion medium and the required other ingredients from those enumerated above. In
the case of sterile powders for the preparation of sterile injectable solutions, the preferred
methods of preparation are vacuum drying and freeze-drying, which yield a powder of
the active ingredient plus any additional desired ingredient from a previously
sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They
can be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral
therapeutic administration, the active compound can be incorporated with excipients and
used in the form of tablets, troches, or capsules. Oral compositions can also be prepared
using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is
applied orally and swished and expectorated or swallowed. Pharmaceutically
compatible binding agents, and/or adjuvant materials can be included as part of the
composition. The tablets, pills, capsules, troches and the like can contain any of the
following ingredients, or compounds of a similar nature: a binder such as
microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or
lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant
such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a
sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint,
methyl salicylate, or orange flavoring.

For administration by inhalation, the compounds are delivered in the form of an
aerosol spray from a pressurized container or dispenser which contains a suitable propellant,
e.g., a gas such as carbon dioxide, or a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For
transmucosal or transdermal administration, penetrants appropriate to the barrier to be
permeated are used in the formulation. Such penetrants are generally known in the art,
and include, for example, for transmucosal administration, detergents, bile salts, and
fusidic acid derivatives. Transmucosal administration can be accomplished through the
use of nasal sprays or suppositories. For transdermal administration, the active
compounds are formulated into ointments, salves, gels, or creams as generally known in
the art.

The compounds can also be prepared in the form of suppositories (e.g., with
conventional suppository bases such as cocoa butter and other glycerides) or retention
enemas for rectal delivery.

In one embodiment, the active compounds are prepared with carriers that will
protect the compound against rapid elimination from the body, such as a controlled
release formulation, including implants and microencapsulated delivery systems.
Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate,
polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid.
Methods for preparation of such formulations will be apparent to those skilled in the art.
The materials can also be obtained commercially from Alza Corporation and Nova
Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected
cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically
acceptable carriers. These can be prepared according to methods known to those skilled
in the art, for example, as described in U.S. Patent No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in
dosage unit form for ease of administration and uniformity of dosage. Dosage unit form
as used herein refers to physically discrete units suited as unitary dosages for the subject
to be treated, each unit containing a predetermined quantity of active compound
calculated to produce the desired therapeutic effect in association with the required
pharmaceutical carrier. The specifications for the dosage unit forms of the invention are
dictated by and directly dependent on the unique characteristics of the active compound
and the particular therapeutic effect to be achieved, and the limitations inherent in the art
of compounding such an active compound for the treatment of individuals.

VI. Gene Therapy

In a preferred embodiment, the nucleic acid molecules used in the methods of the
invention can be inserted into vectors and used as gene therapy vectors. Gene therapy
vectors can be delivered to a subject by, for example, intravenous injection, local
administration (see U.S. Patent No. 5,328,470) or by stereotactic injection (see Chen
et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057). The pharmaceutical preparation
of the gene therapy vector can include the gene therapy vector in an acceptable diluent,
or can comprise a slow release matrix in which the gene delivery vehicle is embedded.
Alternatively, where the complete gene delivery vector can be produced intact from
recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one
or more cells which produce the gene delivery system.

Viral vectors include, for example, recombinant retroviruses, adenovirus,
adeno-associated virus, and herpes simplex virus-1. Retrovirus vectors and adeno-associated
virus vectors are generally understood to be the recombinant gene delivery system of
choice for the transfer of exogenous genes in vivo, particularly into humans. Adenovirus
preferentially targets the liver when administered systemically (greater than 90%;
Antinozzi et al. (1999) Annu. Rev. Nutr. 19:511-544) for reasons that may have to do
with the expression of viral receptors or the lack of vascular barriers in the liver.
Alternatively, they can be used for introducing exogenous genes ex vivo into liver cells in
culture. These vectors provide efficient delivery of genes into liver cells, and the
transferred nucleic acids are stably integrated into the chromosomal DNA of the host
cell.

A major prerequisite for the use of viruses is to ensure the safety of their use,
particularly with regard to the possibility of the spread of wild-type virus in the cell
population. The development of specialized cell lines (termed "packaging cells") which
produce only replication-defective retroviruses has increased the utility of retroviruses
for gene therapy, and defective retroviruses are well characterized for use in gene
transfer for gene therapy purposes (for a review see Miller, A.D. (1990) Blood 76:271).
Thus, recombinant retrovirus can be constructed in which part of the retroviral coding
sequence (gag, pol, env) is replaced by a gene of interest, rendering the retrovirus
replication defective. The replication defective retrovirus is then packaged into virions
which can be used to infect a target cell through the use of a helper virus by standard
techniques. Protocols for producing recombinant retroviruses and for infecting cells in
vitro or in vivo with such viruses can be found in Current Protocols in Molecular
Biology, Ausubel, F.M. et al. (eds.) Greene Publishing Associates, (1989), Sections
9.10-9.14 and other standard laboratory manuals. Examples of suitable retroviruses
include pLJ, pZIP, pWE and pEM, which are well known to those skilled in the art.
Examples of suitable packaging virus lines for
preparing both ecotropic and amphotropic retroviral systems include ψCrip, ψCre, ψ2
and ψAm.

Furthermore, it has been shown that it is possible to limit the infection spectrum
of retroviruses and consequently of retroviral-based vectors, by modifying the viral
packaging proteins on the surface of the viral particle (see, for example, PCT
publications WO93/25234 and WO94/06920). For instance, strategies for the
modification of the infection spectrum of retroviral vectors include: coupling antibodies
specific for cell surface antigens to the viral env protein (Roux et al. (1989) Proc. Natl.
Acad. Sci. USA 86:9079-9083; Julan et al. (1992) J. Gen. Virol. 73:3251-3255; and
Goud et al. (1983) Virology 163:251-254); or coupling cell surface receptor ligands to
the viral env proteins (Neda et al. (1991) J. Biol. Chem. 266:14143-14146). Coupling
can be in the form of the chemical cross-linking with a protein or other variety (e.g.,
lactose to convert the env protein to an asialoglycoprotein), as well as by generating
fusion proteins (e.g., single-chain antibody/env fusion proteins). Thus, in a specific
embodiment of the invention, viral particles containing a nucleic acid molecule
containing a gene of interest operably linked to appropriate regulatory elements are
modified, for example according to the methods described above, such that they can
specifically target subsets of liver cells. For example, the viral particle can be coated
with antibodies to surface molecules that are specific to certain types of liver cells. This
method is particularly useful when only specific subsets of liver cells are desired to be
transfected.

Another viral gene delivery system useful in the present invention utilizes
adenovirus-derived vectors. The genome of an adenovirus can be manipulated such that
it encodes and expresses a gene product of interest but is inactivated in terms of its
ability to replicate in a normal lytic viral life cycle. See, for example, Berkner et al.
(1988) Biotechniques 6:616; Rosenfeld et al. (1991) Science 252:431-434; and
Rosenfeld et al. (1992) Cell 68:143-155. Suitable adenoviral vectors derived from the
adenovirus strain Ad type 5 dl324 or other strains of adenovirus (e.g., Ad2, Ad3, Ad7
etc.) are well known to those skilled in the art. Recombinant adenoviruses can be
advantageous in certain circumstances in that they are capable of infecting
nondividing cells. Furthermore, the virus particle is relatively stable and amenable to
purification and concentration, and as above, can be modified so as to affect the
spectrum of infectivity. Additionally, introduced adenoviral DNA (and foreign DNA
contained therein) is not integrated into the genome of a host cell but remains episomal,
thereby avoiding potential problems that can occur as a result of insertional mutagenesis
in situations where introduced DNA becomes integrated into the host genome (e.g.,
retroviral DNA). Moreover, the carrying capacity of the adenoviral genome for foreign
DNA is large (up to 8 kilobases) relative to other gene delivery vectors (Berkner et al.,
cited supra; Haj-Ahmad and Graham (1986) J. Virol. 57:267). Most replication-defective
adenoviral vectors currently in use and therefore favored by the present
invention are deleted for all or parts of the viral E1 and E3 genes but retain as much as
80% of the adenoviral genetic material (see, e.g., Jones et al. (1979) Cell 16:683;
Berkner et al., supra; and Graham et al. in Methods in Molecular Biology, E. J. Murray,
Ed. (Humana, Clifton, NJ, 1991) vol. 7, pp. 109-127). Expression of the gene of interest
comprised in the nucleic acid molecule can be under control of, for example, the E1A
promoter, the major late promoter (MLP) and associated leader sequences, the E3
promoter, or exogenously added promoter sequences.

Yet another viral vector system useful for delivery of a nucleic acid molecule
comprising a gene of interest is the adeno-associated virus (AAV). Adeno-associated
virus is a naturally occurring defective virus that requires another virus, such as an
adenovirus or a herpes virus, as a helper virus for efficient replication and a productive
life cycle. (For a review see Muzyczka et al. Curr. Topics Microbiol. Immunol. (1992)
158:97-129). Adeno-associated viruses exhibit a high frequency of stable integration
(see for example Flotte et al. (1992) Am. J. Respir. Cell. Mol. Biol. 7:349-356; Samulski
et al. (1989) J. Virol. 63:3822-3828; and McLaughlin et al. (1989) J. Virol. 62:1963-1973).
Vectors containing as few as 300 base pairs of AAV can be packaged and can
integrate. Space for exogenous DNA is limited to about 4.5 kb. An AAV vector such as
that described in Tratschin et al. (1985) Mol. Cell. Biol. 5:3251-3260 can be used to
introduce DNA into T cells. A variety of nucleic acids have been introduced into
different cell types using AAV vectors (see for example Hermonat et al. (1984) Proc.
Natl. Acad. Sci. USA 81:6466-6470; Tratschin et al. (1985) Mol. Cell. Biol. 4:2072-2081;
Wondisford et al. (1988) Mol. Endocrinol. 2:32-39; Tratschin et al. (1984) J.
Virol. 51:611-619; and Flotte et al. (1993) J. Biol. Chem. 268:3781-3790). Other viral
vector systems that may have application in gene therapy have been derived from herpes
virus, vaccinia virus, and several RNA viruses.

Still another viral vector system useful for delivery of a nucleic acid molecule
comprising a gene of interest includes the Herpes simplex virus type 1 (HSV-1) amplicon
vectors for transfer of a gene into muscle (Wang, Y. et al. (2002) Hum. Gene Ther.
13(2):261-273).

Other methods relating to the use of viral vectors in gene therapy can be found
in, e.g., Kay, M.A. (1997) Chest 111(6 Supp.):138S-142S; Ferry, N. and Heard, J. M.
(1998) Hum. Gene Ther. 9:1975-81; Shiratori, Y. et al. (1999) Liver 19:265-74; Oka, K.
et al. (2000) Curr. Opin. Lipidol. 11:179-86; Thule, P.M. and Liu, J.M. (2000) Gene
Ther. 7:1744-52; Yang, N.S. (1992) Crit. Rev. Biotechnol. 12:335-56; Alt, M. (1995) J.
Hepatol. 23:746-58; Brody, S. L. and Crystal, R. G. (1994) Ann. N. Y. Acad. Sci. 716:90-101;
Strayer, D. S. (1999) Expert Opin. Investig. Drugs 8:2159-2172; Smith-Arica, J. R.
and Bartlett, J. S. (2001) Curr. Cardiol. Rep. 3:43-49; and Lee, H. C. et al. (2000)
Nature 408:483-8.

The pharmaceutical compositions can be included in a container, pack, or
dispenser together with instructions for administration.

VII. Uses and Methods of the Invention 

The nucleic acid molecules, polypeptides, polypeptide homologues, modulators,
and antibodies described herein can be used in methods of treatment as well as drug
screening assays. A PGC-1α protein of the invention has one or more of the activities
described herein and can thus be used to, for example, modulate mitochondrial
biogenesis and/or type I muscle formation. The isolated nucleic acid molecules of the
invention can be used to express PGC-1α protein (e.g., via a recombinant expression
vector in a host cell in gene therapy applications), to detect PGC-1α mRNA (e.g., in a
biological sample) or a genetic lesion in a PGC-1α gene, and to modulate PGC-1α
activity, as described further below. In addition, the PGC-1α proteins can be used to
screen drugs or compounds which modulate PGC-1α protein activity as well as to treat
disorders characterized by insufficient or excessive production of PGC-1α protein or
production of PGC-1α protein forms which have increased or decreased activity
compared to wild type PGC-1α. Moreover, the anti-PGC-1α antibodies of the invention
can be used to detect and isolate PGC-1α protein and modulate PGC-1α protein activity.

The invention provides a method (also referred to herein as a "screening assay")
for identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides,
peptidomimetics, small molecules or other drugs) which bind to PGC-1α proteins, have
a stimulatory or inhibitory effect on, for example, PGC-1α expression or PGC-1α
activity, or have a stimulatory or inhibitory effect on, for example, the expression or
activity of a PGC-1α target molecule.

In one embodiment, the invention provides assays for screening candidate or test
compounds which are target molecules of a PGC-1α protein or polypeptide or
biologically active portion thereof. In another embodiment, the invention provides
assays for screening candidate or test compounds which bind to or modulate the activity
of a PGC-1α protein or polypeptide or biologically active portion thereof. The test
compounds of the present invention can be obtained using any of the numerous
approaches in combinatorial library methods known in the art, including: biological
libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic
library methods requiring deconvolution; the 'one-bead one-compound' library method;
and synthetic library methods using affinity chromatography selection. The biological
library approach is limited to peptide libraries, while the other four approaches are
applicable to peptide, non-peptide oligomer or small molecule libraries of compounds
(Lam, K. S. (1997) Anticancer Drug Des. 12:45).

Examples of methods for the synthesis of molecular libraries can be found in the
art, for example, in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al.
(1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem.
37:2678; Cho et al. (1993) Science 261:1303; Carrell et al. (1994) Angew. Chem. Int.
Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop
et al. (1994) J. Med. Chem. 37:1233.

Libraries of compounds may be presented in solution (e.g., Houghten (1992)
Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor
(1993) Nature 364:555-556), bacteria (Ladner USP 5,223,409), spores (Ladner USP
'409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA 89:1865-1869) or on
phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406);
(Cwirla et al. (1990) Proc. Natl. Acad. Sci. USA 87:6378-6382); (Felici (1991) J.
Mol. Biol. 222:301-310); (Ladner supra.).

In one embodiment, an assay is a cell-based assay in which a cell which
expresses a PGC-1α protein or biologically active portion thereof is contacted with a test
compound and the ability of the test compound to modulate PGC-1α activity is
determined. Determining the ability of the test compound to modulate PGC-1α activity
can be accomplished by monitoring, for example, PEPCK, glucose-6-phosphatase,
and/or fructose-1,6-bisphosphatase expression; and/or glucose release into the culture
medium in a cell which expresses PGC-1α. The cell, for example, can be of mammalian
origin, e.g., an Fao hepatoma cell.

The ability of the test compound to modulate PGC-1α binding to a target
molecule can also be determined. Determining the ability of the test compound to
modulate PGC-1α binding to a target molecule can be accomplished, for example, by
coupling the PGC-1α target molecule with a radioisotope or enzymatic label such that
binding of the PGC-1α target molecule to PGC-1α can be determined by detecting the
labeled PGC-1α target molecule in a complex. Alternatively, PGC-1α could be coupled
with a radioisotope or enzymatic label to monitor the ability of a test compound to
modulate PGC-1α binding to a PGC-1α target molecule in a complex. Determining the
ability of the test compound to bind PGC-1α can be accomplished, for example, by
coupling the compound with a radioisotope or enzymatic label such that binding of the
compound to PGC-1α can be determined by detecting the labeled PGC-1α compound in
a complex. For example, compounds (e.g., PGC-1α target molecules) can be labeled
with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by
direct counting of radioemission or by scintillation counting. Alternatively,
compounds can be enzymatically labeled with, for example, horseradish peroxidase,
alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of
conversion of an appropriate substrate to product.
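
As an illustration only, the readout of a radiolabel binding assay of this kind might be reduced to a single number as sketched below. The sketch assumes scintillation counts (counts per minute) are available for a well containing the test compound, a vehicle-only control well, and a nonspecific-binding well; the nonspecific-binding control and the function names are assumptions introduced here and are not part of the described method.

```python
def specific_binding(total_cpm, nonspecific_cpm):
    """Counts attributable to specific binding (hypothetical helper).

    Negative differences are clamped to zero, since they can only
    arise from counting noise.
    """
    return max(total_cpm - nonspecific_cpm, 0.0)


def percent_of_control(test_cpm, control_cpm, nonspecific_cpm):
    """Specific binding with the test compound, as a percentage of the
    specific binding seen in the vehicle-only control."""
    control = specific_binding(control_cpm, nonspecific_cpm)
    if control == 0:
        raise ValueError("control shows no specific binding")
    return 100.0 * specific_binding(test_cpm, nonspecific_cpm) / control


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(percent_of_control(test_cpm=3000, control_cpm=5200, nonspecific_cpm=800))
```

A value well below 100 would suggest that the test compound reduces binding of the labeled molecule, and a value above 100 that it enhances binding; any cutoff would have to be chosen by the practitioner.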

It is also within the scope of this invention to determine the ability of a
compound or target molecule to interact with PGC-1α without the labeling of any of the
interactants. For example, a microphysiometer can be used to detect the interaction of a
compound with PGC-1α without the labeling of either the compound or the PGC-1α.
McConnell, H.M. et al. (1992) Science 257:1906-1912. As used herein, a
"microphysiometer" (e.g., Cytosensor) is an analytical instrument that measures the rate
at which a cell acidifies its environment using a light-addressable potentiometric sensor
(LAPS). Changes in this acidification rate can be used as an indicator of the interaction
between a compound and PGC-1α.
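
By way of a hedged example, an acidification rate could be estimated from a series of pH readings by an ordinary least-squares fit, and the change in that rate on addition of a compound expressed relative to baseline. The data layout (paired time and pH lists) and the function names below are assumptions made for illustration; they do not describe the output format of any particular instrument.

```python
def acidification_rate(times_s, ph_values):
    """Least-squares slope of pH against time, sign-flipped so that a
    falling pH gives a positive rate (pH units per second)."""
    n = len(times_s)
    if n != len(ph_values) or n < 2:
        raise ValueError("need at least two paired readings")
    t_mean = sum(times_s) / n
    p_mean = sum(ph_values) / n
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (p - p_mean) for t, p in zip(times_s, ph_values))
    return -sxy / sxx


def relative_change(baseline_rate, treated_rate):
    """Fractional change in acidification rate relative to baseline."""
    if baseline_rate == 0:
        raise ValueError("baseline rate is zero")
    return (treated_rate - baseline_rate) / baseline_rate


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    times = [0, 10, 20, 30]
    baseline = acidification_rate(times, [7.40, 7.38, 7.36, 7.34])
    treated = acidification_rate(times, [7.40, 7.37, 7.34, 7.31])
    print(baseline, treated, relative_change(baseline, treated))
```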

In another embodiment, an assay is a cell-based assay comprising contacting a
cell expressing a PGC-1α target molecule with a test compound and determining the
ability of the test compound to modulate (e.g., stimulate or inhibit) the activity of the
PGC-1α target molecule. Determining the ability of the test compound to modulate the
activity of a PGC-1α target molecule can be accomplished, for example, by determining
the ability of a PGC-1α protein to bind to or interact with the PGC-1α target molecule,
or by determining the ability of a PGC-1α protein to induce expression from a reporter
construct.

Determining the ability of the PGC-1α protein, or a biologically active fragment
thereof, to bind to or interact with a PGC-1α target molecule can be accomplished by
one of the methods described above for determining direct binding. In a preferred
embodiment, determining the ability of the PGC-1α protein to bind to or interact with a
PGC-1α target molecule can be accomplished by determining the activity of the target
molecule. For example, the activity of the target molecule can be determined by
detecting induction of a cellular response (e.g., expression of type I muscle specific
genes or mitochondrial specific genes), detecting catalytic/enzymatic activity of the
target molecule upon an appropriate substrate, detecting the induction of a reporter gene
(comprising a target-responsive regulatory element operatively linked to a nucleic acid
encoding a detectable marker, e.g., luciferase), or detecting a target-regulated cellular
response (e.g., differentiation into type I muscle or resistance to response to induced
fatigue).

In yet another embodiment, an assay of the present invention is a cell-free assay
in which a PGC-1α protein or biologically active portion thereof is contacted with a test
compound and the ability of the test compound to bind to the PGC-1α protein or
biologically active portion thereof is determined. Preferred biologically active portions
of the PGC-1α proteins to be used in assays of the present invention include fragments
which participate in interactions with target molecules. Binding of the test compound to
the PGC-1α protein can be determined either directly or indirectly as described above.
In a preferred embodiment, the assay includes contacting the PGC-1α protein or
biologically active portion thereof with a known compound which binds PGC-1α to
form an assay mixture, contacting the assay mixture with a test compound, and
determining the ability of the test compound to interact with a PGC-1α protein, wherein
determining the ability of the test compound to interact with a PGC-1α protein
comprises determining the ability of the test compound to preferentially bind to PGC-1α
or biologically active portion thereof as compared to the known compound.

In another embodiment, the assay is a cell-free assay in which a PGC-1α protein
or biologically active portion thereof is contacted with a test compound and the ability of
the test compound to modulate (e.g., stimulate or inhibit) the activity of the PGC-1α
protein or biologically active portion thereof is determined. Determining the ability of
the test compound to modulate the activity of a PGC-1α protein can be accomplished,
for example, by determining the ability of the PGC-1α protein to bind to a PGC-1α
target molecule by one of the methods described above for determining direct binding.
Determining the ability of the PGC-1α protein to bind to a PGC-1α target molecule can
also be accomplished using a technology such as real-time Biomolecular Interaction
Analysis (BIA). Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345
and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705. As used herein, "BIA" is a
technology for studying biospecific interactions in real time, without labeling any of the
interactants (e.g., BIAcore). Changes in the optical phenomenon of surface plasmon
resonance (SPR) can be used as an indication of real-time reactions between biological
molecules.

In an alternative embodiment, determining the ability of the test compound to
modulate the activity of a PGC-1α protein can be accomplished by determining the
ability of the PGC-1α protein to further modulate the activity of a downstream effector
of a PGC-1α target molecule. For example, the activity of the effector molecule on an
appropriate target can be determined or the binding of the effector to an appropriate
target can be determined as previously described.

In yet another embodiment, the cell-free assay involves contacting a PGC-1α
protein or biologically active portion thereof with a known compound which binds the
PGC-1α protein (e.g., PPARγ, HNF-4α, FKHR, or the PEPCK promoter) to form an assay
mixture, contacting the assay mixture with a test compound, and determining the ability
of the test compound to interact with the PGC-1α protein, wherein determining the
ability of the test compound to interact with the PGC-1α protein comprises determining
the ability of the PGC-1α protein to preferentially bind to or modulate the activity of a
PGC-1α target molecule.

In more than one embodiment of the above assay methods of the present
invention, it may be desirable to immobilize either PGC-1α or its target molecule to
facilitate separation of complexed from uncomplexed forms of one or both of the
proteins, as well as to accommodate automation of the assay. Binding of a test
compound to a PGC-1α protein, or interaction of a PGC-1α protein with a target
molecule in the presence and absence of a candidate compound, can be accomplished in
any vessel suitable for containing the reactants. Examples of such vessels include
microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion
protein can be provided which adds a domain that allows one or both of the proteins to
be bound to a matrix. For example, glutathione-S-transferase/PGC-1α fusion proteins
or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione
sepharose beads (Sigma Chemical, St. Louis, MO) or glutathione derivatized
microtiter plates, which are then combined with the test compound or the test
compound and either the non-adsorbed target protein or PGC-1α protein, and the
mixture incubated under conditions conducive to complex formation (e.g., at
physiological conditions for salt and pH). Following incubation, the beads or microtiter
plate wells are washed to remove any unbound components, the matrix is immobilized in
the case of beads, and complex formation is determined either directly or indirectly, for
example, as described above. Alternatively, the complexes can be dissociated from the matrix, and
the level of PGC-1α binding or activity determined using standard techniques.

Other techniques for immobilizing proteins on matrices can also be used in the
screening assays of the invention. For example, either a PGC-1α protein or a PGC-1α
target molecule can be immobilized utilizing conjugation of biotin and streptavidin.
Biotinylated PGC-1α protein or target molecules can be prepared from biotin-NHS
(N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce
Chemicals, Rockford, IL), and immobilized in the wells of streptavidin-coated 96 well
plates (Pierce Chemical). Alternatively, antibodies reactive with PGC-1α protein or
target molecules but which do not interfere with binding of the PGC-1α protein to its
target molecule can be derivatized to the wells of the plate, and unbound target or
PGC-1α protein trapped in the wells by antibody conjugation. Methods for detecting such
complexes, in addition to those described above for the GST-immobilized complexes,
include immunodetection of complexes using antibodies reactive with the PGC-1α
protein or target molecule, as well as enzyme-linked assays which rely on detecting an
enzymatic activity associated with the PGC-1α protein or target molecule.

In another embodiment, modulators of PGC-la expression are identified in a
method wherein a cell is contacted with a candidate compound and the expression of
PGC-la mRNA or protein in the cell is determined. The level of expression of PGC-la
mRNA or protein in the presence of the candidate compound is compared to the level of
expression of PGC-la mRNA or protein in the absence of the candidate compound. The
candidate compound can then be identified as a modulator of PGC-la expression based
on this comparison. For example, when expression of PGC-la mRNA or protein is
greater (statistically significantly greater) in the presence of the candidate compound
than in its absence, the candidate compound is identified as a stimulator of PGC-la
mRNA or protein expression. Alternatively, when expression of PGC-la mRNA or
protein is less (statistically significantly less) in the presence of the candidate compound
than in its absence, the candidate compound is identified as an inhibitor of PGC-la
mRNA or protein expression. The level of PGC-la mRNA or protein expression in the
cells can be determined by methods described herein for detecting PGC-la mRNA or
protein.
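
The comparison of expression in the presence and absence of a candidate compound can be sketched in code. The following Python fragment is an illustration only and is not a method disclosed herein. It assumes that replicate expression measurements (for example, normalized PGC-la mRNA levels) are available for treated and untreated cells, and it uses an exact permutation test on the difference in means as a stand-in for whichever significance test a practitioner would select. The function names, the 0.05 threshold, and the sample values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(control))
    hits = 0
    total = 0
    # Enumerate every way of relabelling the pooled values as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group = [pooled[i] for i in idx]
        rest = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if abs(mean(group) - mean(rest)) >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total


def classify_compound(treated, control, alpha=0.05):
    """Label a compound from expression with (treated) and without (control) it."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"


# Hypothetical replicate measurements (arbitrary units).
print(classify_compound([2.1, 2.4, 2.2, 2.6], [1.0, 1.1, 0.9, 1.2]))
```

With four replicates per group the smallest attainable two-sided p-value is 2/70, so very small experiments can only just reach a 0.05 threshold; larger replicate numbers would be needed for stricter thresholds.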

In yet another aspect of the invention, the PGC-la proteins can be used as "bait
proteins" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Patent No.
5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem.
268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al.
(1993) Oncogene 8:1693-1696; and Brent WO94/10300) to identify other proteins
which bind to or interact with PGC-la ("PGC-la-binding proteins" or "PGC-la-bp")
and are involved in PGC-la activity. Such PGC-la-binding proteins are also likely to
be involved in the propagation of signals by the PGC-la proteins or PGC-la targets as,
for example, downstream elements of a PGC-la-mediated signaling pathway.
Alternatively, such PGC-la-binding proteins may be PGC-la inhibitors.

The two-hybrid system is based on the modular nature of most transcription
factors, which consist of separable DNA-binding and activation domains. Briefly, the
assay utilizes two different DNA constructs. In one construct, the gene that codes for a
PGC-la protein is fused to a gene encoding the DNA binding domain of a known
transcription factor (e.g., GAL-4). In the other construct, a DNA sequence, from a
library of DNA sequences, that encodes an unidentified protein ("prey" or "sample") is
fused to a gene that codes for the activation domain of the known transcription factor. If
the "bait" and the "prey" proteins are able to interact, in vivo, forming a PGC-la-
dependent complex, the DNA-binding and activation domains of the transcription factor
are brought into close proximity. This proximity allows transcription of a reporter gene
(e.g., LacZ) which is operably linked to a transcriptional regulatory site responsive to the
transcription factor. Expression of the reporter gene can be detected and cell colonies
containing the functional transcription factor can be isolated and used to obtain the
cloned gene which encodes the protein which interacts with the PGC-la protein.

In another aspect, the invention pertains to a combination of two or more of the 
assays described herein. For example, a modulating agent can be identified using a cell-
based or a cell-free assay, and the ability of the agent to modulate the activity of a PGC-
la protein can be confirmed in vivo, e.g., in an animal such as a mouse transgenic for
PGC-la, particularly wherein the PGC-la is expressed in the muscle. Compounds can
also be tested in wild-type mice for the ability to increase type I muscle fiber formation. 
Other animals useful in the methods of the invention include those with heart failure, 
disuse atrophy, mitochondrial myopathies, and/or systemic metabolic diseases. 

This invention further pertains to novel agents identified by the above-described
screening assays. Accordingly, it is within the scope of this invention to further use an
agent identified as described herein in an appropriate animal model. For example, an
agent identified as described herein (e.g., a PGC-la modulating agent, an antisense PGC-
la nucleic acid molecule, a PGC-la-specific antibody, or a PGC-la binding partner)
can be used in an animal model to determine the efficacy, toxicity, or side effects of
treatment with such an agent. Alternatively, an agent identified as described herein can
be used in an animal model to determine the mechanism of action of such an agent.
Furthermore, this invention pertains to uses of novel agents identified by the above-
described screening assays for treatments as described herein.

In yet another embodiment, the invention provides a method for identifying a
compound (e.g., a screening assay) capable of use in the treatment of a disorder
characterized by (or associated with) aberrant or abnormal PGC-la nucleic acid
expression or PGC-la polypeptide activity. This method typically includes the step of
assaying the ability of the compound or agent to modulate the expression of the PGC-la
nucleic acid or the activity of the PGC-la protein, thereby identifying a compound for
treating a disorder characterized by aberrant or abnormal PGC-la nucleic acid
expression or PGC-la polypeptide activity. Disorders characterized by aberrant or
abnormal PGC-la nucleic acid expression or PGC-la protein activity are described
herein. Methods for assaying the ability of the compound or agent to modulate the
expression of the PGC-la nucleic acid or activity of the PGC-la protein are typically
cell-based assays. For example, cells which are sensitive to ligands which transduce
signals via a pathway involving PGC-la can be induced to overexpress a PGC-la
protein in the presence and absence of a candidate compound. Candidate compounds
which produce a statistically significant change in PGC-la-dependent responses (either
stimulation or inhibition) can be identified. In one embodiment, expression of the PGC-
la nucleic acid or activity of a PGC-la protein is modulated in cells and the effects of
candidate compounds on the readout of interest (such as rate of cell proliferation or
differentiation) are measured. For example, the expression of genes which are up- or
down-regulated in response to a PGC-la protein-dependent signal cascade (e.g.,
myoglobin, troponin I slow, troponin I fast, MCAD, COX II, COX IV, and/or
cytochrome c) can be assayed. In preferred embodiments, the regulatory regions of such
genes, e.g., the 5' flanking promoter and enhancer regions, are operably linked to a
detectable marker (such as luciferase) which encodes a gene product that can be readily
detected. Phosphorylation of PGC-la or PGC-la target molecules can also be
measured, for example, by immunoblotting.

Alternatively, modulators of PGC-la nucleic acid expression (e.g., compounds
which can be used to treat a disorder characterized by aberrant or abnormal PGC-la
nucleic acid expression or PGC-la protein activity) can be identified in a method
wherein a cell is contacted with a candidate compound and the expression of PGC-la
mRNA or protein in the cell is determined. The level of expression of PGC-la mRNA
or protein in the presence of the candidate compound is compared to the level of
expression of PGC-la mRNA or protein in the absence of the candidate compound. The
candidate compound can then be identified as a modulator of PGC-la nucleic acid
expression based on this comparison and be used to treat a disorder characterized by
aberrant PGC-la nucleic acid expression. For example, when expression of PGC-la
mRNA or polypeptide is greater (statistically significantly greater) in the presence of the
candidate compound than in its absence, the candidate compound is identified as a
stimulator of PGC-la nucleic acid expression. Alternatively, when PGC-la nucleic acid
expression is less (statistically significantly less) in the presence of the candidate
compound than in its absence, the candidate compound is identified as an inhibitor of
PGC-la nucleic acid expression. The level of PGC-la nucleic acid expression in the
cells can be determined by methods described herein for detecting PGC-la mRNA or
protein.

Modulators of PGC-la protein activity and/or PGC-la nucleic acid expression
identified according to these drug screening assays can be used to treat, for example,
type I muscle associated disorders such as heart failure, disuse atrophy, mitochondrial
myopathies, and/or systemic metabolic diseases. Modulators of PGC-la protein activity
and/or PGC-la nucleic acid expression may also be used to treat disorders related to
other functions of PGC-la unrelated to type I muscle formation. These methods of
treatment include the steps of administering the modulators of PGC-la protein activity
and/or nucleic acid expression, e.g., in a pharmaceutical composition as described in
subsection IV above, to a subject in need of such treatment, e.g., a subject with a disorder
described herein.

This invention is further illustrated by the following examples which should not 
be construed as limiting. The contents of all references, patent applications, patents, and 
published patent applications, as well as the Figures and the Sequence Listing cited 
throughout this application are hereby incorporated by reference.

EXAMPLES 

EXAMPLE 1: PGC-la IS PREFERENTIALLY EXPRESSED IN SLOW-TWITCH
MUSCLE FIBERS

This example describes the investigation of PGC-la expression levels in muscle. 
RNA was extracted from various types of mouse muscle using standard methods and 
subjected to a standard Northern blotting protocol using a PGC-la probe. High levels of 
PGC-la mRNA expression were seen in soleus (a slow-twitch muscle). Extensor
digitorum longus (EDL), quadriceps, gastrocnemius, and tibialis anterior (TA) muscles
(all fast-twitch muscles) showed low-level expression. PGC-1β expression was also
examined in soleus, EDL, quadriceps, and TA muscles. Moderate expression was seen
in all of these muscle types. 

EXAMPLE 2: INDUCTION OF SLOW-TWITCH MUSCLE FIBER
DIFFERENTIATION BY TRANSGENIC OVEREXPRESSION OF PGC-la

This example describes the results of overexpression of PGC-la in transgenic
mice. The PGC-la cDNA sequence was placed under the control of a muscle-specific
promoter (the muscle creatine kinase (MCK) promoter). The MCK promoter is active
in both type I and type II muscle fibers, but its activity is enriched in type II
muscle. Transgenic mice were generated using DNA microinjection and screened by
PCR. Four independent founder lines were obtained (#23, #26, #29, #31) and mated
with wild type mice to obtain progeny for use in experiments. Transgenic lines #23 and
#31 showed strong PGC-la mRNA expression. Line #26 showed low-level PGC-la
mRNA expression. Line #29 showed little PGC-la mRNA expression. Western
blotting showed that PGC-la protein expression levels were not increased in the soleus
(type I) muscles of the high-expressing transgenic mice. While no expression of PGC-
la protein is normally seen in plantaris muscles in non-transgenic mice, PGC-la protein
is expressed in the plantaris muscles of the high-expressing transgenic mice at the same
level as in soleus muscle.

mRNA was extracted from the muscle fibers of the transgenic lines and subjected
to Northern blotting. Transgenic PGC-la expression resulted in enhanced expression of
markers indicative of mitochondrial biogenesis. These markers include medium chain
acyl CoA dehydrogenase (MCAD), cytochrome c oxidase II (COX II), cytochrome c
oxidase IV (COX IV), and cytochrome c. Transgenic PGC-la expression also resulted
in a dose-dependent decrease in the expression of a type II (fast-twitch) fiber specific
marker (troponin I fast) and a dose-dependent increase in the expression of type I (slow-
twitch) fiber specific markers (myoglobin, troponin I slow) in otherwise fast-twitch (type
II) fibers, indicating a switch of differentiation program toward type I fibers in the
presence of PGC-la. Western blotting also indicated a switch in the differentiation
program toward type I fibers in the presence of PGC-la. The vastus muscle isolated
from the transgenic mice showed strong increases in the levels of myosin, troponin, and
cytochrome c (a mitochondrial marker) protein.

Macroscopic examination of the skeletal muscles of the transgenic mice
indicated a greatly increased amount of dark-colored (type I) muscle throughout the
entire bodies of the mice, as compared to wild type. Specific examination of the hind-
limb muscles further showed that the muscles were much darker than the same muscles
in the wild type mice. Specific side-by-side examination of the soleus and
gastrocnemius muscles showed that the gastrocnemius muscle (normally a type II
muscle) was the same dark color in the transgenic mice as the soleus muscle.
Metachromatic and anti-myosin histological analysis of plantaris muscle confirmed that
the number of type I fibers is significantly increased in the transgenic mice, as compared
to their wild type littermate controls.

The muscles from the transgenic mice were also tested for type I specific
functional properties. EDL muscles were isolated and subjected to electrostimulation,
and the force generated by the muscles was measured. Fatigue was defined as the time
point when the force generated dropped to 30% of the initial force generated. This assay
mimics the effects of exercise on the muscles. In this assay, the EDL muscles
isolated from the transgenic mice were significantly more resistant to exercise-induced
fatigue (P < 0.05), a hallmark of slow-twitch muscle fibers and of muscles following
endurance training.
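
The fatigue criterion used in this example (the time at which force falls to 30% of the initial force) can be expressed as a short calculation. The Python sketch below is illustrative only. It assumes the force trace is available as paired lists of sample times and force values, and it linearly interpolates between samples, which is an assumption of this sketch rather than a detail of the assay. The function name and sample data are hypothetical.

```python
def time_to_fatigue(times, forces, fraction=0.30):
    """Return the time at which force first falls to `fraction` of its initial value.

    Interpolates linearly between the two samples that bracket the threshold.
    Returns None if the force never falls that far within the recording.
    """
    if len(times) != len(forces) or not forces:
        raise ValueError("times and forces must be non-empty and of equal length")
    if forces[0] <= 0 or not 0 < fraction < 1:
        raise ValueError("initial force must be positive and fraction in (0, 1)")
    threshold = fraction * forces[0]
    for i in range(1, len(forces)):
        if forces[i] <= threshold:
            t0, t1 = times[i - 1], times[i]
            f0, f1 = forces[i - 1], forces[i]
            # f0 is above the threshold and f1 is at or below it, so f0 > f1.
            return t0 + (f0 - threshold) * (t1 - t0) / (f0 - f1)
    return None


# Hypothetical trace: seconds and force in arbitrary units.
print(time_to_fatigue([0, 10, 20, 30, 40], [100, 80, 50, 20, 10]))
```

A longer time to fatigue for one muscle than for another, under identical stimulation, corresponds to the greater fatigue resistance described above.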

EXAMPLE 3: AUTOREGULATORY LOOP CONTROLS PGC-la
EXPRESSION IN SKELETAL MUSCLE

Skeletal muscle contains muscle fibers that differ greatly in their oxidative
capacity. Prolonged electrical stimulation or exercise training can lead to a muscle fiber
type conversion of type II (fast-twitch) to type I (slow-twitch) fibers (Booth, F. W., and 
Thomason, D. B. (1991) Physiol Rev 71, 541-585). Conversely, physical inactivity or 
denervation can cause a switch to type II fibers (Booth, F. W., and Thomason, D. B. 
(1991) Physiol Rev 71, 541-585). The conversion to type I fibers is characterized by a 
dramatic change in expression of a large number of genes that increase the oxidative
capacity and number of mitochondria, as well as synthesis of distinct contractile proteins
characteristic of this muscle fiber type (Berchtold, M. W., et al (2000) Physiol Rev 80, 
1215-1265). Exercise training is accompanied by an increase in motor nerve activity 
that subsequently elevates intracellular calcium levels in the muscle (Olson, E. N., and 
Williams, R. S. (2000) Bioessays 22, 510-519; Hood, D. A. (2001) J Appl Physiol 90,
1137-1157). Calcium and the calcium-binding protein calmodulin activate both the
calcium/calmodulin-dependent protein kinase IV (CaMKIV) and the protein phosphatase
calcineurin A (CnA) as well as many other factors (Hood, D. A. (2001) J Appl Physiol
90, 1137-1157). Activated CaMKIV catalyzes protein phosphorylation events that result
in release of the myocyte enhancer factor 2 (MEF2) transcription factors from a complex
including the histone deacetylases HDAC1/2 and HDAC4/5, the repressor Cabin-1 and
the adaptor mSin3 (Corcoran, E. E., and Means, A. R. (2001) J Biol Chem 276, 2975-
2978). Upon phosphorylation by CaMKIV, these factors are bound to 14-3-3 proteins
and exported from the nucleus; as a consequence, the MEF2s are now transcriptionally
active and can bind co-activator proteins including CBP/p300 or PGC-la (McKinsey, T.
A., et al (2001) Curr Opin Genet Dev 11, 497-504; McKinsey, T. A., et al (2002)
Trends Biochem Sci 27, 40-47; Michael, L. F., et al (2001) Proc Natl Acad Sci USA 
98, 3820-3825). 

In another arm of the calcium signaling pathway, activated CnA 
dephosphorylates members of the nuclear factor of activated T-cells (NFAT) family,
thereby stimulating a cytoplasmic-nuclear translocation of these proteins (Olson, E. N.,
and Williams, R. S. (2000) Cell 101, 689-692). The combined action of MEF2s and 
NFATs in the nucleus increases the transcription of prototypical muscle fiber type I 
genes, and thus promotes muscle fiber type switching from type II to type I (Chin, E. R., 
et al. (1998) Genes Dev 12, 2499-2509). Activated CnA provides a further boost to this
process by dephosphorylating MEF2 and enhancing its transcriptional activity (Wu, H., et
al (2001) EMBO J 20, 6414-6423).

Proof of concept of this general model came from transgenic mice that express 
either constitutively active CnA or CaMKIV, respectively (Naya, F. J., et al (2000) J 
Biol Chem 275, 4545-4548; Wu, H., et al (2002) Science 296, 349-352). In these mice,
the relative amount of type I muscle fibers is greatly increased in comparison to wildtype
animals, supporting a crucial role for CnA and CaMKIV in muscle fiber type
determination. They are furthermore characterized by enhanced mitochondrial
biogenesis, upregulation of enzymes involved in oxidative metabolism and greater 
resistance to fatigue (Wu, H., et al (2002) Science 296, 349-352). Interestingly, the main
effect of CaMKIV and CnA was an increase in type I muscle fiber number,
not skeletal muscle hypertrophy. Although CnA has been implicated in the
molecular mechanism that stimulates hypertrophy, these animal models demonstrate that 
slow fiber type determination and muscular hypertrophy can be separated and depend on 
the cellular stimuli and context (Naya, F. J., et al (2000) J Biol Chem 275, 4545-4548; 
Musaro, A., et al (1999) Nature 400, 581-585). 

PGC-la was originally cloned from brown adipose tissue and has been shown to
coactivate a variety of nuclear receptors and other transcription factors (described in U.S.
Patent No. 6,166,192, incorporated herein in its entirety by reference). Moreover, PGC-
la is a potent stimulator of mitochondrial biogenesis and oxidative metabolism in
several tissues including skeletal muscle (Michael, L. F., et al (2001) Proc Natl Acad
Sci USA 98, 3820-3825; Wu, Z., et al (1999) Cell 98, 115-124). These aspects of
energy metabolism are crucial in muscle fiber type differentiation, and thus, as set forth
herein, transgenic expression of PGC-la driven by a muscle-specific promoter results in
a dramatic increase of type I muscle fibers. Increased expression of fiber type I proteins,
higher oxidative capacity and greater resistance to fatigue can be observed in the mice
that ectopically express PGC-la. It has also been found that PGC-la may regulate its
own transcription and, through this autoregulatory loop, help to maintain expression of
fiber type I-specific genes.

METHODS AND MATERIALS 

Plasmids and reagents.

The 5'-flanking sequence of mouse PGC-la was obtained from the CELERA™
Mouse Genome database. Various fragments of this promoter were subsequently
amplified by PCR and subcloned into the pGL3basic reporter gene vector
(PROMEGA™). The constructs containing the regions between +78 and -2533 or
-6483 with respect to the transcriptional start site were designated 2 kb and 6 kb,
respectively. All constructs were verified by sequencing. Expression plasmids for
MEF2C, MEF2D, NFATc3, CaMKIV and constitutively active CnA were gifts from Dr.
Eric N. Olson, University of Texas Southwestern Medical Center, Dallas, TX. The 
dominant negative cyclic AMP response element binding protein (CREB) called 
ACREB was provided by Dr. Charles Vinson, National Cancer Institute, National 
Institutes of Health, Bethesda, MD. All reagents were obtained from SIGMA™.

Site-directed mutagenesis. 

Site-directed mutagenesis was performed as described previously (Handschin, C.,
and Meyer, U. A. (2000) J Biol Chem 275, 13362-13369). Briefly, PCR amplifications
were performed using overlapping primers at the target sites; the resulting PCR
product was digested with DpnI to remove residual template and subsequently
transformed into bacteria. Clones containing the mutation were digested with KpnI and
BglII and the insert subcloned into a new reporter gene vector. The cAMP-responsive
element (CRE) and the MEF-binding site were mutated into a BglII and a SacII site and
termed ACRE and AMEF2, respectively. Constructs were verified by both restriction
digestion and sequencing. 

Cell culture, transfection and reporter gene assays. 

C2C12 cells were maintained in DMEM supplemented with 10% fetal calf serum
and 1 µM Na-pyruvate in subconfluent cultures. Cells were subsequently transfected
using Lipofectamine transfection reagent (INVITROGEN™) and reporter gene levels
were determined 48 hours after transfection. Cells were lysed and analyzed for luciferase
expression using the Enhanced Luciferase Assay Kit (BD PHARMINGEN™) according
to the supplier's manual. Reporter gene expression levels were subsequently normalized
against β-galactosidase levels driven by the cotransfected pSV-β-galactosidase
expression vector (PROMEGA™). Finally, these relative expression levels were normalized
against empty reporter gene vector expression.
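
The two normalization steps described above (luciferase against β-galactosidase, then against the empty reporter vector) amount to a ratio of ratios. The following Python sketch is illustrative only; the function name, argument names, and readings are hypothetical, and raw readings are assumed to be background-corrected already.

```python
def relative_reporter_activity(luciferase, beta_gal, empty_luciferase, empty_beta_gal):
    """Normalize a luciferase reading in two steps.

    Step 1: divide by the co-transfected beta-galactosidase reading, which
            corrects for well-to-well differences in transfection efficiency.
    Step 2: divide by the same ratio measured for the empty reporter vector,
            giving activity relative to the promoterless background.
    """
    if min(beta_gal, empty_luciferase, empty_beta_gal) <= 0:
        raise ValueError("control readings must be positive")
    sample_ratio = luciferase / beta_gal
    empty_ratio = empty_luciferase / empty_beta_gal
    return sample_ratio / empty_ratio


# Hypothetical readings in arbitrary light and absorbance units.
print(relative_reporter_activity(1200.0, 300.0, 100.0, 250.0))
```

A value of 1 in this scheme means the promoter construct behaves like the empty vector under the same conditions.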

Analysis of PGC-la gene expression in wildtype and transgenic PGC-la mice.

Wildtype and transgenic mice from strain #31 (described above in Example 2)
that highly express PGC-la in muscle were sacrificed, skeletal muscle was collected,
and total RNA was isolated using the Trizol reagent following the manufacturer's
instructions and subsequently reverse transcribed. Primers for the ABI Prism 7700
sequence detector (APPLIED BIOSYSTEMS™) were designed with the Primer Express™
software, targeting PGC-la exon 2, the PGC-la 3' untranslated region, mouse cytochrome c,
uncoupling protein 3, myoglobin, glyceraldehyde 3-phosphate dehydrogenase and 18S
rRNA. Using the SYBR green PCR master mix, expression levels of total PGC-la
(primers for exon 2) and endogenous PGC-la (primers for the 3' untranslated region) as
well as of the other genes were determined from at least three wildtype and transgenic 
mice and subsequently normalized against 18S rRNA levels. 
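
Normalization of real-time PCR data against 18S rRNA can be sketched as follows. The text above does not state which quantification scheme was used; this Python fragment assumes the widely used comparative threshold cycle (delta-delta Ct) calculation with roughly 100% amplification efficiency, and that assumption, the function name, and the Ct values are all hypothetical.

```python
def fold_change_ddct(ct_target_tg, ct_ref_tg, ct_target_wt, ct_ref_wt):
    """Fold change of a target transcript in transgenic versus wildtype samples.

    Each Ct is a threshold cycle. The reference gene (here assumed to be
    18S rRNA) corrects for differences in input RNA between samples.
    Assumes amplification doubles the product each cycle.
    """
    delta_tg = ct_target_tg - ct_ref_tg   # normalized target level, transgenic
    delta_wt = ct_target_wt - ct_ref_wt   # normalized target level, wildtype
    return 2.0 ** -(delta_tg - delta_wt)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# in the transgenic sample after normalization.
print(fold_change_ddct(22.0, 10.0, 25.0, 10.0))
```

Under this scheme, separate primer pairs for exon 2 and for the 3' untranslated region would each yield their own fold change, one for total and one for endogenous transcript.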

RESULTS 

PGC-la has been shown to be elevated in the skeletal muscles of mice that
express a CaMKIV transgene in this tissue (Wu, H., et al (2002) Science
296, 349-352). Furthermore, CaMKIV was found to activate the human PGC-la
promoter, but the mechanistic basis for this has not been investigated (Wu, H., et al
(2002) Science 296, 349-352). As depicted in Figure 1A, proximal promoter fragments
that are 2 kb or 6 kb in size are both activated when co-transfected with a vector 
expressing a constitutively active CaMKIV. Coexpression of a constitutively active 
CnA has only a minimal effect on reporter gene levels corroborating the results obtained 
in the transgenic CaMKIV and CnA models, respectively (Wu, H., et al (2002) Science 
296, 349-352). The combination of CaMKIV and CnA has at least an additive effect in 
increasing transcription controlled by the PGC-la promoter. In this experiment, C2C12 
cells were cotransfected with expression plasmids for CnA, CaMKIV and ACREB 
together with reporter gene plasmids containing different fragments of the mouse PGC- 
la promoter. After 48 hours, cells were harvested and reporter gene levels determined. 

CaMKIV has been shown to phosphorylate and activate many proteins including
CREB. Since CREB has been shown to be an important component of PGC-la
expression in the fasted liver (Herzig, S., et al (2001) Nature 413, 179-183), dominant
negative ACREB protein was utilized to examine a potential role for CREB in the
CaMKIV-mediated control of the PGC-la promoter. While ACREB had no effect on
the PGC-la promoter when expressed alone, this protein was able to severely reduce the
activation of the PGC-la promoter by CaMKIV alone or CaMKIV in combination with
CnA (Figure 1A).

The human PGC-la promoter contains a CRE at -133/-116 that is crucial for
PGC-la induction by cAMP in the liver (Herzig, S., et al (2001) Nature 413, 179-183).
Similarly, a highly conserved putative CRE can be identified in the mouse PGC-la
promoter at approximately the same distance from the transcriptional start site (Figure
1B). The functional role of this mouse CRE was tested by site-directed mutagenesis
followed by stimulation of the mutated promoter with 100 pM forskolin, a reagent which
stimulates formation of cAMP, for 10 hours. As shown in Figure 1C, mutagenesis of the
CRE site abolished induction of 2 kb of the mouse PGC-la promoter by forskolin. The
same results were obtained when treating the cells with 8-bromo-cAMP whereas the
inactive analog 1,9-dideoxyforskolin had no effect on the PGC-la promoter (data not
shown). Importantly, the CRE-mutated (ACRE) PGC-la promoter showed dramatically
impaired response to CaMKIV alone or the combination of CaMKIV and CnA in these assays,
indicating a key role for CREB in the induction of PGC-la expression by these
mediators of calcium signaling (Figure 1C). Similar observations were made when
using larger fragments of the mouse PGC-la promoter. In this experiment, C2C12 cells
were cotransfected with expression plasmids for CnA, CaMKIV and ACREB together
with reporter gene plasmids containing 2 kb of wildtype or PGC-la promoter with a
mutation in the CRE site (ACRE), respectively. Cells were subsequently treated with
either vehicle (0.1% DMSO) or 100 pM forskolin for 10 hours and harvested 48 hours
after transfection before reporter gene levels were determined.
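
Locating a candidate CRE in a promoter sequence and reporting its position relative to the transcriptional start site can be sketched as below. This Python fragment is illustrative only. It assumes the canonical palindromic CRE consensus TGACGTCA as the search motif (the actual mouse PGC-la CRE sequence is not reproduced here), it uses the numbering convention in which the start site is +1 and there is no position 0, and the toy sequence and names are hypothetical.

```python
CRE_CONSENSUS = "TGACGTCA"  # assumed search motif: canonical full CRE palindrome


def find_cre_sites(sequence, tss_index, motif=CRE_CONSENSUS):
    """Return (start, end) positions of motif matches relative to the start site.

    tss_index is the 0-based index of the +1 nucleotide in `sequence`.
    Upstream bases are numbered -1, -2, ... and there is no position 0.
    Because the default motif is its own reverse complement, scanning one
    strand is sufficient for it.
    """
    seq = sequence.upper()

    def to_position(index):
        offset = index - tss_index
        return offset + 1 if offset >= 0 else offset

    sites = []
    start = seq.find(motif)
    while start != -1:
        end = start + len(motif) - 1
        sites.append((to_position(start), to_position(end)))
        start = seq.find(motif, start + 1)
    return sites


# Toy sequence: 10 filler bases, the motif, then 9 filler bases.
toy = "A" * 10 + "TGACGTCA" + "C" * 9
print(find_cre_sites(toy, tss_index=23))
```

A mutated promoter in which the motif has been replaced by a restriction site would return an empty list from the same scan.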

While CREB appears to be an important factor in the induction of PGC-la, the 
increased effect of CaMKIV in combination with CnA indicates that factors in addition 
to CREB are likely to be involved in the transcription of the PGC-la gene in muscle. 
Since MEF2 and NFAT transcription factors are known targets of CaMKIV and CnA in
muscle fiber type determination, the role of these factors in control of the PGC-la
promoter was tested. As depicted in Figure 2A, MEF2C, MEF2D or NFATc3 alone did
not have a significant effect on the 6 kb PGC-la promoter construct. However, since
MEF2 proteins are known to be coactivated by PGC-la (Michael, L. F., et al (2001)
Proc Natl Acad Sci USA 98, 3820-3825), these factors were cotransfected and the
experiments revealed coactivation of both MEF2C and MEF2D but not NFAT by PGC-
la (Figure 2A). These data indicate that PGC-la participates in the activation of its own
promoter, and the MEF2 proteins may be upstream as well as downstream of PGC-la
expression. In this experiment, C2C12 cells were cotransfected with expression
plasmids for MEF2C, MEF2D, NFAT and PGC-la together with reporter gene plasmids
containing 6 kb of the mouse PGC-la promoter. After 48 hours, cells were harvested
and reporter gene levels determined.

The transcriptional capacities of both MEF2 and NFAT are known to be increased by 
CaMKIV- and CnA-mediated changes in phosphorylation status. CnA is able to 
substantially increase the activity of MEF2C and MEF2D (Figure 2B). The strongest 
effect on the PGC-la promoter was observed when cotransfecting MEF2C or MEF2D
together with CnA and PGC-la (Figure 2B). No effect was found upon coexpression
of any of these proteins with NFAT. In contrast to the effects of CnA, the effect of
CaMKIV on this reporter gene construct was changed neither by addition of MEF2s nor by
PGC-la. This indicates that a major effect of CaMKIV may be in activating PGC-la
expression via CREB independently of PGC-la coactivation, whereas CnA apparently is
able to further increase the potency of MEF2s to stimulate transcription of PGC-la.
Similarly, in this experiment, C2C12 cells were cotransfected with expression plasmids
for MEF2C, MEF2D, NFAT, CnA and PGC-la together with reporter gene plasmids 
containing 6 kb of the mouse PGC-la promoter. After 48 hours, cells were harvested 
and reporter gene levels determined. 

As depicted in Figure 3A, computer-aided sequence analysis of the mouse PGC-la 5'-
flanking region revealed a high-scoring MEF2 binding site at -1464/-1447 (TRANSFAC
matrix V$AMEF2.01) and an NFAT binding site at -1547/-1536 (TRANSFAC matrix
V$NFAT.01) (Quandt, K., et al (1995) Nucleic Acids Res 23, 4878-4884). Similar
configurations of adjacent MEF2 and NFAT binding sites have previously been
described in several muscle fiber type I specific promoters (Chin, E. R., et al (1998)
Genes Dev 12, 2499-2509). Thus, it was tested whether site-directed mutagenesis of the
MEF2 site affects MEF2 activity on the reporter gene construct. The mutated 2 kb fragment
(referred to as AMEF2) is no longer able to mediate MEF2C or MEF2D induction either
when activated by CnA or when coactivated with PGC-la, indicating that this site is
responsible for the MEF2 action (Figure 3B). In this experiment, C2C12 cells were
cotransfected with expression plasmids for MEF2C, MEF2D, CnA and PGC-la together
with reporter gene plasmids containing 2 kb of wildtype or 2 kb of mouse PGC-la 
promoter with a mutation in the MEF2-binding site (AMEF2). After 48 hours, cells 
were harvested and reporter gene levels determined. 

The ability of PGC-la to stimulate the PGC-la promoter via coactivation of the
MEF2 proteins indicates a potential autoregulatory loop (Figure 4A). Exercise and
subsequently elevated intracellular calcium levels result in an activation of both
CaMKIV and CnA in skeletal muscle. Activated CaMKIV can phosphorylate CREB
which then increases transcription of PGC-la via a conserved CREB-binding site in the
proximal promoter. Moreover, CaMKIV and CnA activate the transcriptional activity of
MEF2s in part by promoting the dissociation of inhibitory HDACs and Cabin1. MEF2s,
potentially in combination with NFAT, bind to at least one MEF2-binding site in the
PGC-la flanking region and increase transcriptional activity. Newly synthesized PGC-
la protein can coactivate MEF2s and thus positively regulate its own transcription.
PGC-la may also compete with the inhibitory HDACs and Cabin1 for binding to
MEF2s and thus ensure a stable transcription leading to muscle fiber type I
determination.

Thus, increased levels of PGC-la protein should lead to a stable expression of 
PGC-la by coactivation of MEF2s on its own promoter. In order to critically test this 
hypothesis, real-time PCR primers for the PGC-la 3' untranslated region were designed
that should allow distinct determination of the levels of ectopically expressed and
endogenous PGC-la. Total RNA from wildtype and transgenic skeletal muscle was
analyzed for the expression levels of total and endogenous PGC-la mRNA using real-
time PCR primers targeted for exon 2 (Figure 4B) and the 3' untranslated region (Figure
4C), respectively. The same RNA was analyzed for the expression of cytochrome c (Cyt
c), uncoupling protein-3 (UCP-3), myoglobin and glyceraldehyde 3-phosphate
dehydrogenase (GAPDH) (Figure 4D). Relative mRNA expression levels were
normalized against 18S rRNA levels. 

As shown in Figure 4B, primers designed to target PGC-la exon 2 reveal a more 
than 80-fold increase in total PGC-la levels in the muscle of transgenic mice in 
comparison to wildtype animals. These findings are similar to the results observed in
Northern blots for the high-expressing transgenic line #31 (Lin, J., et al (2002) Nature
418, 797-801). When exclusively measuring endogenous PGC-la with primers
designed for the 3' untranslated region that is missing in the transgenic constructs, an
approximately 7-fold elevation of endogenous PGC-la was observed in the transgenic
animals as compared to wildtype mice (Figure 4C). A robust increase in the transcript
levels of cytochrome c (Cyt c), uncoupling protein-3 (UCP-3) and myoglobin in the
RNA isolated from skeletal muscle of the transgenic mice was also observed, whereas
glyceraldehyde 3-phosphate dehydrogenase (GAPDH) levels remained unchanged 
(Figure 4D). 

These results strongly indicate that PGC-1α transcription is regulated by PGC-1α
protein in an autoregulatory loop. Moreover, PGC-1α gene expression in muscle is
reminiscent of other prototypical fiber type I genes such as myoglobin. Thus, gene
expression analysis of transgenic PGC-1α animals further underscores the importance of
PGC-1α in its own regulation.

Accordingly, based on these results, it appears that CaMKIV stimulates PGC-1α
expression, namely by phosphorylating and thus activating CREB, a transcription factor
implicated in PGC-1α transcription in many different tissues (Puigserver, P. et al (1998)
Cell 92, 829-839; Herzig, S., et al (2001) Nature 413, 179-183).

These data indicate an initial activation of PGC-1α transcription by CaMKIV via
CREB (Figure 4A). As soon as PGC-1α is expressed, it can act as a cofactor for de-repressed
MEF2 on fiber type I target genes as well as on its own promoter, thus ensuring
stable, high expression levels. Moreover, PGC-1α binding to MEF2 may prevent
binding of the MEF2-repressing HDACs and Cabin1 proteins. Although CnA and
NFAT did not affect the PGC-1α promoter on their own, an increase in CaMKIV- and
MEF2-mediated induction was observed when CnA and NFAT were cotransfected. This
may be explained by CnA-triggered activation of MEFs and other factors or by
stabilization of MEF2 binding to the promoter due to NFAT. Recent reports using in
vivo models support the methods described herein, such as the data showing a rapid
increase of PGC-1α mRNA and protein levels after exercise in rats and man (Goto, M.,
et al (2000) Biochem Biophys Res Commun 274, 350-354; Terada, S., et al (2002)
Biochem Biophys Res Commun 296, 350-354; Baar, K., et al (2002) FASEB J 16, 1879-1886).

Equivalents 

Those skilled in the art will recognize, or be able to ascertain using no more than
routine experimentation, many equivalents to the specific embodiments of the invention
described herein. Such equivalents are intended to be encompassed by the following
claims.

What is claimed:



1. A method for modulating type I muscle formation comprising contacting
a cell with an agent that modulates PGC-1α expression or activity, such that type I
muscle formation is modulated.

2. The method of claim 1, wherein PGC-1α expression or activity is
increased.

3. The method of claim 1, wherein PGC-1α expression or activity is
decreased.

4. The method of claim 1, wherein type I muscle formation is increased.

5. The method of claim 1, wherein the agent is a PGC-1α nucleic acid
molecule.

6. The method of claim 5, wherein the PGC-1α nucleic acid molecule is
derived from a human.

7. The method of claim 6, wherein the PGC-1α nucleic acid molecule
comprises the nucleic acid sequence of SEQ ID NO:1.

8. The method of claim 5, wherein the PGC-1α nucleic acid molecule is
contained within a vector.

9. The method of claim 8, wherein the vector is an adenoviral or an
adeno-associated vector.

10. The method of claim 1, wherein the agent is a PGC-1α polypeptide.

11. The method of claim 10, wherein the PGC-1α polypeptide is derived
from a human.

12. The method of claim 11, wherein the PGC-1α polypeptide comprises the
amino acid sequence of SEQ ID NO:2.

13. The method of claim 1, wherein the agent is a small molecule.

14. The method of claim 1, wherein the cell is a muscle cell.

15. The method of claim 14, wherein the muscle cell is a skeletal muscle cell.

16. The method of claim 15, wherein the skeletal muscle cell is selected from
the group consisting of a type I muscle cell and a type II muscle cell.

17. The method of claim 1, wherein the method is performed in vitro.

18. The method of claim 1, wherein the method is performed in vivo.

19. The method of claim 18, wherein the method is performed in a mouse.

20. The method of claim 18, wherein the method is performed in a human.

21. A method for identifying a compound capable of modulating type I
muscle formation comprising:

a) contacting a cell with a compound; and

b) determining whether PGC-1α expression or activity is modulated.

22. The method of claim 21, wherein PGC-1α expression or activity is
increased.

23. The method of claim 21, wherein PGC-1α expression is measured by
Northern blotting.

24. The method of claim 21, wherein determining whether PGC-1α
expression or activity is modulated comprises determining whether expression of at least
one of myoglobin, troponin I slow, troponin I fast, MCAD, COX II, COX IV, or
cytochrome c is modulated.

25. The method of claim 24, wherein expression is measured by Northern
blotting.

26. The method of claim 21, wherein the cell is a muscle cell.

27. The method of claim 21, wherein the muscle cell is a skeletal muscle cell.

28. The method of claim 27, wherein the skeletal muscle cell is selected from
the group consisting of: a type I muscle cell and a type II muscle cell.

29. A compound identified by the method of claim 21.

30. A method for identifying a compound capable of treating a disorder
characterized by aberrant type I muscle formation comprising assaying the ability of the
compound to modulate the expression or activity of PGC-1α to thereby identify a
compound capable of treating a disorder characterized by aberrant type I muscle
formation.

31. The method of claim 30, wherein PGC-1α expression or activity is
increased.

32. The method of claim 30, wherein PGC-1α expression is measured by
Northern blotting.

33. The method of claim 30, wherein determining whether PGC-1α
expression or activity is modulated comprises determining whether expression of at least
one of myoglobin, troponin I slow, troponin I fast, MCAD, COX II, COX IV, or
cytochrome c is modulated.

34. The method of claim 33, wherein expression is measured by Northern
blotting.

35. The method of claim 30, wherein the cell is a muscle cell.

36. The method of claim 35, wherein the muscle cell is a skeletal muscle cell.

37. The method of claim 36, wherein the skeletal muscle cell is selected from
the group consisting of: a type I muscle cell and a type II muscle cell.

38. A compound identified by the method of claim 30.

39. A method for treating a subject having a disorder characterized by
aberrant type I muscle formation comprising administering to the subject an agent
capable of modulating PGC-1α expression or activity, such that the disorder is treated.

40. The method of claim 39, wherein the disorder is selected from the group
consisting of heart failure, disuse atrophy, a mitochondrial myopathy, and a systemic
metabolic disorder.

41. The method of claim 39, wherein PGC-1α expression or activity is
increased.

42. The method of claim 41, wherein type I muscle formation is increased.

43. The method of claim 39, wherein the agent is a PGC-1α nucleic acid
molecule.

44. The method of claim 43, wherein the PGC-1α nucleic acid molecule is
derived from a human.

45. The method of claim 44, wherein the PGC-1α nucleic acid molecule
comprises the nucleic acid sequence of SEQ ID NO:1.

46. The method of claim 43, wherein the PGC-1α nucleic acid molecule is
contained within a vector.

47. The method of claim 46, wherein the vector is an adenoviral or an
adeno-associated vector.

48. The method of claim 39, wherein the agent is a small molecule.

49. A method for increasing type I muscle formation in a subject comprising
administering to the subject an agent capable of increasing PGC-1α expression or
activity, such that type I muscle formation is increased.

50. The method of claim 49, wherein the agent is a PGC-1α nucleic acid
molecule.

51. The method of claim 50, wherein the PGC-1α nucleic acid molecule is
derived from a human.

52. The method of claim 51, wherein the PGC-1α nucleic acid molecule
comprises the nucleic acid sequence of SEQ ID NO:1.

53. The method of claim 50, wherein the PGC-1α nucleic acid molecule is
contained within a vector.

54. The method of claim 53, wherein the vector is an adenoviral or an
adeno-associated vector.

55. The method of claim 49, wherein the agent is a small molecule.

56. A nonhuman transgenic animal comprising an exogenous PGC-1α
nucleic acid molecule, wherein the exogenous PGC-1α nucleic acid molecule is
expressed in the skeletal muscle of the animal.

57. The transgenic animal of claim 56, wherein the exogenous PGC-1α
nucleic acid molecule is operatively linked to a muscle specific promoter.

58. The transgenic animal of claim 57, wherein the muscle specific promoter
is selected from the group consisting of: the muscle creatine kinase promoter, the
dystrophin promoter, the myostatin promoter, the GDF-8 promoter, the UCP-3 promoter,
the MyoD promoter, the MEF2 promoter, the myosin heavy chain promoter, the
myosin light chain promoter, and a troponin promoter.

59. The transgenic animal of claim 57, wherein the expression of at least one
of myoglobin, troponin I slow, MCAD, COX II, COX IV, or cytochrome c is
upregulated in the muscle cells of the animal.

60. The transgenic animal of claim 56, wherein the animal is a mouse.

CRE

B  Human PGC-1α  -133 AGTGACGTCA GGAGTTTG -116
   Mouse PGC-1α  -146 AGTGACGTCA GGAGTTTG -129
   ΔCRE          -146 AGTAGATCTA GGAGTTTG -129
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SEQUENCE LISTING 
<110> Dana Farber Cancer Institute , et al. 

<120> Methods and Compositions For Modulating Type I Muscle 
Formation Using PGC-1alpha

<130> DFN-041PC 

<140> 
<141> 

<150> 60/357,069 
<151> 2002-02-13 

<160> 20 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 3023 
<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (89) . . (2482) 
<400> 1 

caggtggctg gttgcctgca tgagtgtgtg ctctgtgtca ctgtggattg gagttgaaaa 60 

agcttgactg gcgtcattca ggagctgg atg gcg tgg gac atg tgc aac cag 112 

Met Ala Trp Asp Met Cys Asn Gin 
1 5 

gac tct gag tct gta tgg agt gac ate gag tgt get get ctg gtt ggt 160 
Asp Ser Glu Ser Val Trp Ser Asp lie Glu Cys Ala Ala Leu Val Gly 
10 15 ^20 

gaa gac cag cct ctt tgc cca gat ctt cct gaa ctt gat ctt tct gaa 208 
Glu Asp Gin Pro Leu Cys Pro Asp Leu Pro Glu Leu Asp Leu Ser Glu 
25 30 '35 40 

eta gat gtg aac gac ttg gat aca gac age ttt ctg ggt gga etc aag 256 
Leu Asp Val Asn Asp Leu Asp Thr Asp Ser Phe Leu Gly Gly Leu Lys 
45 50 55 

tgg tgc agt gac caa tea gaa ata ata tec aat cag tac aac aat gag 304 
Trp Cys Ser Asp Gin Ser Glu lie lie Ser Asn Gin Tyr Asn Asn Glu 
60 65 70 

cct tea aac ata ttt gag aag ata gat gaa gag aat gag gca aac ttg 352 
Pro Ser Asn lie Phe Glu Lys lie Asp Glu Glu Asn Glu Ala Asn Leu 
75 80 85 

eta gca gtc etc aca gag aca eta gac agt etc cct gtg gat gaa gac 400 
Leu Ala Val Leu Thr Glu Thr Leu Asp Ser Leu Pro Val Asp Glu Asp 
90 95 100 

gga ttg ccc tea ttt gat gcg ctg aca gat gga gac gtg acc act gac 448 
Gly Leu Pro Ser Phe Asp Ala Leu Thr Asp Gly Asp Val Thr Thr Asp 
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2 

105 110 115 120 

aat gag get agt cct tec tec atg cct gac ggc ace cct cca ccc cag 496 
Asn Glu Ala Ser Pro Ser Ser Met Pro Asp Gly Thr Pro Pro Pro Gin 
125 130 135 

gag gca gaa gag ccg tct eta ctt aag aag etc tta ctg gca cca gee 544 
Glu Ala Glu Glu Pro Ser Leu Leu Lys Lys Leu Leu Leu Ala Pro Ala 
140 145 150 

aac act cag eta agt tat aat gaa tgc agt ggt etc agt acc cag aac 592 
Asn Thr Gin Leu Ser Tyr Asn Glu Cys Ser Gly Leu Ser Thr Gin Asn 
155 160 165 

cat gca aat cac aat cac agg ate aga aca aac cct gca att gtt aag 640 
His Ala Asn His Asn His Arg lie Arg Thr Asn Pro Ala He Val Lys 
170 175 180 

act gag aat tea tgg ag\p aat aaa gcg aag agt att tgt caa cag caa 688 
Thr Glu Asn Ser Trp Ser Asn Lys Ala Lys Ser He Cys Gin Gin Gin 
185 190 195 200 

aag cca caa aga cgt ccc tgc teg gag ctt etc aaa tat ctg acc aca 736 
Lys Pro Gin Arg Arg Pro Cys Ser Glu Leu Leu Lys Tyr Leu Thr Thr 
205 210 ' 215 

aac gat gac cct cct cac acc aaa ccc aca gag aac aga aac age age 784 
Asn Asp Asp Pro Pro His Thr Lys Pro Thr Glu Asn Arg Asn Ser Ser 
220 225 230 

aga gac aaa tgc acc tec aaa aag aag tec cac aca cag teg cag tea 832 
Arg Asp Lys Cys Thr Ser Lys Lys Lys Ser His Thr Gin Ser Gin Ser 
235 240 245 

caa cac tta caa gee aaa cca aca act tta tct ctt cct ctg acc cca 880 
Gin His Leu Gin Ala Lys Pro Thr Thr Leu Ser Leu Pro Leu Thr Pro 
250 255 260 

gag tea cca aat gac ccc aag ggt tec cca ttt gag aac aag act att 928 
Glu Ser Pro Asn Asp Pro Lys Gly Ser Pro Phe Glu Asn Lys Thr- lie 
265 270 275 ~ 280 

gaa cgc acc tta agt gtg gaa etc tct gga act gca ggc eta act cca 976 
Glu Arg Thr Leu Ser Val Glu Leu Ser Gly Thr Ala Gly Leu Thr Pro 
285 290 295 

ccc acc act cct cct cat aaa gee aac caa gat aac cct ttt agg get 1024 
Pro Thr Thr Pro Pro His Lys Ala Asn Gin Asp Asn Pro Phe Arg Ala 
300 305 310 

tct cca aag ctg aag tec tct tgc aag act gtg gtg cca cca cca tea 1072 
Ser Pro Lys Leu Lys Ser Ser Cys Lys Thr Val Val Pro Pro Pro Ser 
315 320 325 

aag aag ccc agg tac agt gag tct tct ggt aca caa ggc aat aac tec 1120 
Lys Lys Pro Arg Tyr Ser Glu Ser Ser Gly Thr Gin Gly Asn Asn Ser 
330 335 340 

acc aag aaa ggg ccg gag caa tec gag ttg tat gca caa etc age aag 1168 
Thr Lys Lys Gly Pro Glu Gin Ser Glu Leu Tyr Ala Gin Leu Ser Lys 
345 350 355 360 



WO 03/068944 






PCT/US03/04792 



tec tea gtc etc act ggt gga cac gag gaa agg aag ace aag egg ccc 1216 
Ser Ser Val Leu Thr Gly Gly His Glu Glu Arg Lys Thr Lys Arg Pro 
365 " 370 375 

agt ctg egg ctg ttt ggt gac cat gac tat tgc cag tea att aat tec 1264 
Ser Leu Arg Leu Phe Gly Asp His Asp Tyr Cys Gin Ser lie Asn Ser 
380 385 390 

aaa acg gaa ata etc att aat ata tea cag gag etc caa gac tct aga 1312 
Lys Thr Glu lie Leu lie Asn lie Ser Gin Glu Leu Gin Asp Ser Arg 
395 400 405 

caa eta gaa aat aaa gat gtc tec tct gat tgg cag ggg cag att tgt 1360 
Gin Leu Glu Asn Lys Asp Val Ser Ser Asp Trp Gin Gly Gin He Cys 
410 415 420 

tct tec aca gat tea gac cag tgc tac ctg aga gag act ttg gag gca 1408 
Ser Ser Thr Asp Ser Asp Gin Cys Tyr Leu Arg Glu Thr Leu Glu Ala 
425 430 " 435 440 

age aag cag gtc tct cct tgc age aca aga aaa cag etc caa gac cag 1456 
Ser Lys Gin Val Ser Pro Cys Ser Thr Arg Lys Gin Leu Gin Asp Gin 
445 450 455 

gaa ate cga gee gag ctg aac aag cac ttc ggt cat ccc agt caa get 1504 
Glu He Arg Ala Glu Leu Asn Lys His Phe Gly His Pro Ser Gin Ala 
460 465 470 

gtt ttt gac gac gaa gca gac aag ace ggt gaa ctg agg gac agt gat 1552 
Val Phe Asp Asp Glu Ala Asp Lys Thr Gly Glu Leu Arg Asp Ser Asp 
475 480 485 

ttc agt aat gaa caa ttc tec aaa eta cct atg ttt ata aat tea gga 1600 
Phe Ser Asn Glu Gin Phe Ser Lys Leu Pro Met Phe He Asn Ser Gly 
490 495 500 

eta gec atg gat ggc ctg ttt gat gac age gaa gat aaa agt gat aaa 1648 
Leu Ala Met Asp Gly Leu Phe Asp Asp Ser Glu Asp Lys Ser Asp Lys 
505 510 515 520 

ctg age tac cct tgg gat ggc acg caa tec tat tea ttg ttc aat gtg 1696 
Leu Ser Tyr Pro Trp Asp Gly-' Thr Gin Ser Tyr Ser Leu Phe Asn Val 
525 530 535 

tct cct tct tgt tct tct ttt aac tct cca tgt aga gat tct gtg tea 1744 
Ser Pro Ser Cys Ser Ser Phe Asn Ser Pro Cys Arg Asp Ser Val Ser 
540 545 550 

cca ccc aaa tec tta ttt tct caa aga ccc caa agg atg cgc tct cgt 1792 
Pro Pro Lys Ser Leu Phe Ser Gin Arg Pro Gin Arg Met Arg Ser Arg 
555 560 565 

tea agg tec ttt tct cga cac agg teg tgt tec cga tea cca tat tec 1840 
Ser Arg Ser Phe Ser Arg His Arg Ser Cys Ser Arg Ser Pro Tyr Ser 
570 575 580 

agg tea aga tea agg tct cca ggc agt aga tec tct tea aga tec tgc 1888 
Arg Ser Arg Ser Arg Ser Pro Gly Ser Arg Ser Ser Ser Arg Ser Cys 
585 590 595 600 
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tat tac tat gag tea age cac tac aga cac cgc acg cac cga aat tct 1936 
Tyr Tyr Tyr Glu Ser Ser His Tyr Arg His Arg Thr His Arg Asn Ser 
605 610 615 

ccc ttg tat gtg aga tea cgt tea aga teg ccc tac age cgt egg ccc 1984 
Pro Leu Tyr Val Arg Ser Arg Ser Arg Ser Pro Tyr Ser Arg Arg Pro 
620 " 625 630 

agg tat gac age tac gag gaa tat cag cac gag agg ctg aag agg gaa 2032 
Arg Tyr Asp Ser Tyr Glu Glu Tyr Gin His Glu Arg Leu Lys Arg Glu 
635 ^ 640 645 

gaa tat cgc aga gag tat gag aag cga gag tct gag agg gec aag caa 2080 
Glu Tyr Arg Arg Glu Tyr Glu Lys Arg Glu Ser Glu Arg Ala Lys Gin 
650 655 660 

agg gag agg cag agg cag aag gca att gaa gag cgc cgt gtg att tat 2128 
Arg Glu Arg Gin Arg Gin Lys Ala lie Glu Glu Arg Arg Val lie Tyr 
665 670 675 680 

gtc ggt aaa ate aga cct gac aca aca egg aca gaa ctg agg gac cgt 2176 
Val Gly Lys lie Arg Pro Asp Thr Thr Arg Thr Glu Leu Arg Asp Arg 
685 690 695 

ttt gaa gtt ttt ggt gaa att gag gag tgc aca gta aat ctg egg gat 2224 
Phe Glu Val Phe Gly Glu lie Glu Glu Cys Thr Val Asn Leu Arg Asp 
700 705 710 

gat gga gac age tat ggt ttc att ace tac cgt tat ace tgt gat get 2272 
Asp Gly Asp Ser Tyr Gly Phe lie Thr Tyr Arg Tyr Thr Cys Asp Ala 
715 720 725 

ttt get get ctt gaa aat gga tac act ttg cgc agg tea aac gaa act 2320 
Phe Ala Ala Leu Glu Asn Gly Tyr Thr Leu Arg Arg Ser Asn Glu Thr 
730 735 740 

gac ttt gag ctg tac ttt tgt gga cgc aag caa ttt ttc aag tct aac 2368 
Asp Phe Glu Leu Tyr Phe Cys Gly Arg Lys Gin Phe Phe Lys Ser Asn 
745 750 755 760 

tat gca gac eta gat tea aac tea gat gac ttt gac cct_ get tec acc 2416 
Tyr Ala Asp Leu Asp Ser Asn Ser Asp Asp Phe Asp Pro Ala Ser Thr 
765 -770 775 

aag age aag tat gac tct ctg gat ttt gat agt tta ctg aaa gaa get 2464 
Lys Ser Lys Tyr Asp Ser Leu Asp Phe Asp Ser Leu Leu Lys Glu Ala 
780 785 790 

cag aga age ttg cgc agg taacatgttc cctagctgag gatgacagag 2512 
Gin Arg Ser Leu Arg Arg 
795 

ggatggcgaa tacctcatgg gacagcgcgt ccttccctaa agactattgc aagtcatact 2572

taggaatttc tcctacttta cactctctgt acaaaaacaa aacaaaacaa caacaataca 2632 

acaagaacaa caacaacaat aacaacaatg gtttacatga acacagctgc tgaagaggca 2692 

agagacagaa tgatatccag taagcacatg tttattcatg ggtgtcagct ttgcttttcc 2752

tggagtctct tggtgatgga gtgtgcgtgt gtgcatgtat gtgtgtgtgt atgtatgtgt 2812 
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gtggtgtgtg tgcttggttt aggggaagta tgtgtgggta catgtgagga ctgggggcac 2872 

ctgaccagaa tgcgcaaggg caaaccattt caaatggcag cagttccatg aagacacact 2932 

taaaacctag aacttcaaaa tgttcgtatt ctattcaaaa ggaaaaatat atatatatat 2992 
atatatatat aaattaaaaa aaaaaaaaaa a 3023 



<210> 2 
<211> 798 
<212> PRT 

<213> Homo sapiens 
<400> 2 

Met Ala Trp Asp Met Cys Asn Gin Asp Ser Glu Ser Val Trp Ser Asp 
15 10 15 

lie Glu Cys Ala Ala Leu Val Gly Glu Asp Gin Pro Leu Cys Pro Asp 
20 25 30 

Leu Pro Glu Leu Asp Leu Ser Glu Leu Asp Val Asn Asp Leu Asp Thr 
35 40 45 

Asp Ser Phe Leu Gly Gly Leu Lys Trp Cys Ser Asp Gin Ser Glu lie 
50 55 ~ 60 

lie Ser Asn Gin Tyr Asn Asn Glu Pro Ser Asn lie Phe Glu Lys lie 
65 70 75 80 

Asp Glu Glu Asn Glu Ala Asn Leu Leu Ala Val Leu Thr Glu Thr Leu 
85 90 95 

Asp Ser Leu Pro Val Asp Glu Asp Gly Leu Pro Ser Phe Asp Ala Leu 
100 105 110 

Thr Asp Gly Asp Val Thr Thr Asp Asn Glu Ala Ser Pro Ser Ser Met 
115 120 125 

Pro Asp Gly Thr Pro Pro Pro Gin Glu Ala Glu Glu Pro Ser Leu Leu 
130 135 140 

Lys Lys Leu Leu Leu Ala Pro Ala Asn Thr Gin Leu Ser Tyr Asn Glu 
145 150 155 160 

Cys Ser Gly Leu Ser Thr Gin Asn His Ala Asn His Asn His Arg lie 
165 170 175 

Arg Thr Asn Pro Ala He Val Lys Thr Glu Asn Ser Trp Ser Asn Lys 
180 185 190 

Ala Lys Ser He Cys Gin Gin Gin Lys Pro Gin Arg Arg Pro Cys Ser 
195 200 205 

Glu Leu Leu Lys Tyr Leu Thr Thr Asn Asp Asp Pro Pro His Thr Lys 
210 215 220 

Pro Thr Glu Asn Arg Asn Ser Ser Arg Asp Lys Cys Thr Ser Lys Lys 
225 230 235 240 
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Lys Ser His Thr Gin Ser Gin Ser Gin His Leu Gin Ala Lys Pro Thr 
245 250 255 

Thr Leu Ser Leu Pro Leu Thr Pro Glu Ser Pro Asn Asp Pro Lys Gly 
260 265 270 

Ser Pro Phe Glu Asn Lys Thr lie Glu Arg Thr Leu Ser Val Glu Leu 
275 " 280 285 

Ser Gly Thr Ala Gly Leu Thr Pro Pro Thr Thr Pro Pro His Lys Ala 
290 295 300 

Asn Gin Asp Asn Pro Phe Arg Ala Ser Pro Lys Leu Lys Ser Ser Cys 
305 " 310 315 320 

Lys Thr Val Val Pro Pro Pro Ser Lys Lys Pro Arg Tyr Ser Glu Ser 
325 330 335 

Ser Gly Thr Gin Gly Asn Asn Ser Thr Lys Lys Gly Pro Glu Gin Ser 
340 345 350 

Glu Leu Tyr Ala Gin Leu Ser Lys Ser Ser Val Leu Thr Gly Gly His 
355 360 365 

Glu Glu Arg Lys Thr Lys Arg Pro Ser Leu Arg Leu Phe Gly Asp His 
370 * 375 380 

Asp Tyr Cys Gin Ser lie Asn Ser Lys Thr Glu lie Leu He Asn He 
385 390 395 400 

Ser Gin Glu Leu Gin Asp Ser Arg Gin Leu Glu Asn Lys Asp Val Ser 
405 410 415 

Ser Asp Trp Gin Gly Gin He Cys Ser Ser Thr Asp Ser Asp Gin Cys 
420 425 430 

Tyr Leu Arg Glu Thr Leu Glu Ala Ser Lys Gin Val Ser Pro Cys Ser 
435 440 ' 445 

Thr Arg Lys Gin Leu Gin Asp Gin Glu He Arg Ala Glu Leu Asn Lys 
450 455 ^ 460 

His Phe Gly His Pro Ser Gin Ala Val Phe Asp Asp Glu Ala Asp Lys 
465 470 475 480 

Thr Gly Glu Leu Arg Asp Ser Asp Phe Ser Asn Glu Gin Phe Ser Lys 
485 490 495 

Leu Pro Met Phe He Asn Ser Gly Leu Ala Met Asp Gly Leu Phe Asp 
500 505 510 

Asp Ser Glu Asp Lys Ser Asp Lys Leu Ser Tyr Pro Trp Asp Gly Thr 
515 520 525 

Gin Ser Tyr Ser Leu Phe Asn Val Ser Pro Ser Cys Ser Ser Phe Asn 
530 535 540 

Ser Pro Cys Arg Asp Ser Val Ser Pro Pro Lys Ser Leu Phe Ser Gin 
545 550 555 560 

Arg Pro Gin Arg Met Arg Ser Arg Ser Arg Ser Phe Ser Arg His Arg 
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565 570 575 

Ser Cys Ser Arg Ser Pro Tyr Ser Arg Ser Arg Ser Arg Ser Pro Gly 
580 585 590 

Ser Arg Ser Ser Ser Arg Ser Cys Tyr Tyr Tyr Glu Ser Ser His Tyr 
595 600 605 

Arg His Arg Thr His Arg Asn Ser Pro Leu Tyr Val Arg Ser Arg Ser 
610 615 620 

Arg Ser Pro Tyr Ser Arg Arg Pro Arg Tyr Asp Ser Tyr Glu Glu Tyr 
625 630 635 640 

Gin His Glu Arg Leu Lys Arg Glu Glu Tyr Arg Arg Glu Tyr Glu Lys 
645 650 655 

Arg Glu Ser Glu Arg Ala Lys Gin Arg Glu Arg Gin Arg Gin Lys Ala 
660 665 670 

He Glu Glu Arg Arg Val He Tyr Val Gly Lys He Arg Pro Asp Thr 
675 680 685 

Thr Arg Thr Glu Leu Arg Asp Arg Phe Glu Val Phe Gly Glu He Glu 
690 695 700 

Glu Cys Thr Val Asn Leu Arg Asp Asp Gly Asp Ser Tyr Gly Phe He 
705 710 715 720 

Thr Tyr Arg Tyr Thr Cys Asp Ala Phe Ala Ala Leu Glu Asn Gly Tyr 
725 " 730 735 

Thr Leu Arg Arg Ser Asn Glu Thr Asp Phe Glu Leu Tyr Phe Cys Gly 
740 745 750 

Arg Lys Gin Phe Phe Lys Ser Asn Tyr Ala Asp Leu Asp Ser Asn Ser 
755 760 765 

Asp Asp Phe Asp Pro Ala Ser Thr Lys -Ser Lys Tyr Asp Ser Leu Asp 
770 775 780 

Phe Asp Ser Leu Leu Lys Glu Ala Gin Arg Ser Leu Arg Arg 
785 790 795 



<210> 3 
<211> 5 
<212> PRT 

<213> Mus mus cuius 
<220> 

<221> VARIANT 
<222> 2, 3 

<223> Xaa = Any Amino Acid 
<400> 3 

Leu Xaa Xaa Leu Leu 
1 5 



<210> 4 
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<211> 3066 

<212> DNA 

<213> Mus musculus 

<220> 

<221> CDS 

<222> (92) . . (2482) 

<400> 4 

aattcggcac gaggttgcct gcatgagtgt gtgctgtgtg tcagagtgga ttggagttga 60 

aaaagcttga ctggcgtcat tcgggagctg g atg get tgg gac atg tgc age 112 

Met Ala Trp Asp Met Cys Ser 
1 5 

caa gac tct gta tgg agt gac ata gag tgt get get ctg gtt ggt gag 160 
Gin Asp Ser Val Trp Ser Asp He Glu Cys Ala Ala Leu Val Gly Glu 
10 15 20 

gac cag cct ctt tgc cca gat ctt cct gaa ctt gac ctt tct gaa ctt 208 
Asp Gin Pro Leu Cys Pro Asp Leu Pro Glu Leu Asp Leu Ser Glu Leu 
25 ~ 30 35 

gat gtg aat gac ttg gat aca gac age ttt ctg ggt gga ttg aag tgg 256 
Asp Val Asn Asp Leu Asp Thr Asp Ser Phe Leu Gly Gly Leu Lys Trp 
40 45 50 55 

tgt age gac caa teg gaa ate ata tec aac cag tac aac aat gag cct 304 
Cys Ser Asp Gin Ser Glu He He Ser Asn Gin Tyr Asn Asn Glu Pro 
60 65 70 

gcg aac ata ttt gag aag ata gat gaa gag aat gag gca aac ttg eta 352 
Ala Asn He Phe Glu Lys He Asp Glu Glu Asn Glu Ala Asn Leu Leu 
75 80 85 

gcg gtc etc aca gag aca ctg gac agt etc ccc gtg gat gaa gac gga 400 
Ala Val Leu Thr Glu Thr Leu Asp Ser Leu Pro Val Asp Glu Asp Gly 
90 95 100 

ttg ccc tea ttt gat gca ctg aca gat gga gee gtg acc act gac aac 448 
Leu Pro Ser Phe Asp Ala Leu Thr Asp Gly Ala Val Thr Thr Asp Asn 
105 110 ' 115 

gag gee agt cct tec tec atg cct gac ggc acc cct ccc cct cag gag 496 
Glu Ala Ser Pro Ser Ser Met Pro Asp Gly Thr Pro Pro Pro Gin Glu 
120 125 130 135 

gca gaa gag ccg tct eta ctt aag aag etc tta ctg gca cca gee aac 544 
Ala Glu Glu Pro Ser Leu Leu Lys Lys Leu Leu Leu Ala Pro Ala Asn 
140 " 145 150 

act cag etc age tac aat gaa tgc age ggt ctt age act cag aac cat 592 
Thr Gin Leu Ser Tyr Asn Glu Cys Ser Gly Leu Ser Thr Gin Asn His 
155 160 165 

gca gca aac cac acc cac agg ate aga aca aac cct gee att gtt aag 640 
Ala Ala Asn His Thr His Arg He Arg Thr Asn Pro Ala He Val Lys 
170 175 180 

acc gag aat tea tgg age aat aaa gcg aag age att tgt caa cag caa 688 
Thr Glu Asn Ser Trp Ser Asn Lys Ala Lys Ser He Cys Gin Gin Gin 
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185 190 195 

aag cca caa aga cgt ccc tgc tea gag ctt etc aag tat ctg acc aca 736 
Lys Pro Gin Arg Arg Pro Cys Ser Glu Leu Leu Lys Tyr Leu Thr Thr 
200 205 210 215 

aac gat gac cct cct cac acc aaa ccc aca gaa aac agg aac age age 784 
Asn Asp Asp Pro Pro His Thr Lys Pro Thr Glu Asn Arg Asn Ser Ser 
220 225 230 

aga gac aaa tgt get tec aaa aag aag tec cat aca caa ccg cag teg 832 
Arg Asp Lys Cys Ala Ser Lys Lys Lys Ser His Thr Gin Pro Gin Ser 
235 ~ 240 245 

caa cat get caa gee aaa cca aca act tta tct ctt cct ctg acc cca 880 
Gin His Ala Gin Ala Lys Pro Thr Thr Leu Ser Leu Pro Leu Thr Pro 
250 255 260 

gag tea cca aat gac ccc aag ggt tec cca ttt gag aac aag act att 928 
Glu Ser Pro Asn Asp Pro Lys Gly Ser Pro Phe Glu Asn Lys Thr He 
265 270 275 

gag cga acc tta agt gtg gaa etc tct gga act gca ggc eta act cct 97 6 
Glu Arg Thr Leu Ser Val Glu Leu Ser Gly Thr Ala Gly Leu Thr Pro 
280 " 285 290 295 

ccc aca act cct cct cat aaa gee aac caa gat aac cct ttc aag get 1024 ■ 
Pro Thr Thr Pro Pro His Lys Ala Asn Gin Asp Asn Pro Phe Lys Ala 
300 305 310 

teg cca aag ctg aag ccc tct tgc aag acc gtg gtg cca ccg cca acc 1072 
Ser Pro Lys Leu Lys Pro Ser Cys Lys Thr Val Val Pro Pro Pro Thr 
315 320 325 

aag agg gee egg tac agt gag tgt tct ggt acc caa ggc age cac tec 1120' 
Lys Arg Ala Arg Tyr Ser Glu Cys Ser Gly Thr Gin Gly Ser His Ser 
330 335 340 

acc aag aaa ggg ccc gag caa tct gag ttg tac gca caa etc age aag 1168 
Thr Lys Lys Gly Pro Glu Gin Ser Glu Leu Tyr Ala Gin Leu Ser Lys 
345 350 355 

tec tea ggg etc age cga gga cac gag gaa agg aag act aaa egg ccc 1216 
Ser Ser Gly Leu Ser Arg Gly His Glu Glu Arg Lys Thr Lys Arg Pro 
360 365 370 375 

agt etc egg ctg ttt ggt gac cat gac tac tgt cag tea etc aat tec 1264 
Ser Leu Arg Leu Phe Gly Asp His Asp Tyr Cys Gin Ser Leu Asn Ser 
380 385 390 

aaa acg gat ata etc att aac ata tea cag gag etc caa gac tct aga 1312 
Lys Thr Asp He Leu He Asn He Ser Gin Glu Leu Gin Asp Ser Arg 
395 400 405 

caa eta gac ttc aaa gat gee tec tgt gac tgg cag ggg cac ate tgt 1360 
Gin Leu Asp Phe Lys Asp Ala Ser Cys Asp Trp Gin Gly His He Cys 
410 ~ 415 420 

tct tec aca gat tea ggc cag tgc tac ctg aga gag act ttg gag gee 1408 
Ser Ser Thr Asp Ser Gly Gin Cys Tyr Leu Arg Glu Thr Leu Glu Ala 
425 430 435 
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age aag cag gtc tct cct tgc age acc aga aaa cag etc caa gac cag 1456 

Ser Lys Gin Val Ser Pro Cys Ser Thr Arg Lys Gin Leu Gin Asp Gin 
440 445 450 455 

gaa ate cga gcg gag ctg aac aag cac ttc ggt cat ccc tgt caa get 1504 
Glu He Arg Ala Glu Leu Asn Lys His Phe Gly His Pro Cys Gin Ala 
460 465 470 

gtg ttt gac gac aaa tea gac aag acc agt gaa eta agg gat ggc gac 1552 
Val Phe Asp Asp Lys Ser Asp Lys Thr Ser Glu Leu Arg Asp Gly Asp 
475 480 485 

ttc agt aat gaa caa ttc tec aaa eta cct gtg ttt ata aat tea gga 1600 
Phe Ser Asn Glu Gin Phe Ser Lys Leu Pro Val Phe He Asn Ser Gly 
490 495 500 

eta gec atg gat ggc eta ttt gat gac agt gaa gat gaa agt gat aaa 1648 
Leu Ala Met Asp Gly Leu Phe Asp Asp Ser Glu Asp Glu Ser Asp Lys 
505 ~ 510 515 

ctg age tac cct tgg gat ggc acg cag ccc tat tea ttg ttc gat gtg 1696 
Leu Ser Tyr Pro Trp Asp Gly Thr Gin Pro Tyr Ser Leu Phe Asp Val 
520 525 530 535 

teg cct tct tgc tct tec ttt aac tct ccg tgt cga gac tea gtg tea 1744 
Ser Pro Ser Cys Ser Ser Phe Asn Ser Pro Cys Arg Asp Ser Val Ser 
540 545 550 

cca ccg aaa tec tta ttt tct caa aga ccc caa agg atg cgc tct cgt 1792 
Pro Pro Lys Ser Leu Phe Ser Gin Arg Pro Gin Arg Met Arg Ser Arg 
555 560 565 

tea aga tec ttt tct cga cac agg teg tgt tec cga tea cca tat tec 1840 
Ser Arg Ser Phe Ser Arg His Arg Ser Cys Ser Arg Ser Pro Tyr Ser 
570 575 580 

agg tea aga tea agg tec cca ggc agt aga tec tct tea aga tec tgt 1888 
Arg Ser Arg Ser Arg Ser Pro Gly Ser Arg Ser Ser Ser Arg Ser Cys 
585 590 595 

tac tac tat gaa tea age cac tac aga cac cgc aca cac cgc aat tct 1936 
Tyr Tyr Tyr Glu Ser Ser His Tyr Arg His Arg Thr His Arg Asn Ser 
600 605 610 615 

ccc ttg tat gtg aga tea cgt tea agg tea ccc tac age cgt agg ccc 1984 
Pro Leu Tyr Val Arg Ser Arg Ser Arg Ser Pro Tyr Ser Arg Arg Pro 
620 625 630 

agg tac gac age tat gaa gee tat gag cac gaa agg etc aag agg gat 2032 
Arg Tyr Asp Ser Tyr Glu Ala Tyr Glu His Glu Arg Leu Lys Arg Asp 
635 640 645 

gaa tac cgc aaa gag cac gag aag egg gag tct gaa agg gee aaa cag 2080 
Glu Tyr Arg Lys Glu His Glu Lys Arg Glu Ser Glu Arg Ala Lys Gin 
650 ~ 655 660 

aga gag agg cag aag cag aaa gca att gaa gag cgc cgt gtg att tac 2128 
Arg Glu Arg Gin Lys Gin Lys Ala He Glu Glu Arg Arg Val He Tyr 
665 670 675 
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gtt ggt aaa ate aga cct gac aca acg egg aca gaa ttg aga gac cgc 2176 
Val Gly Lys lie Arg Pro Asp Thr Thr Arg Thr Glu Leu Arg Asp Arg 
680 685 690 695 

ttt gaa gtt ttt ggt gaa att gag gaa tgc acc gta aat ctg egg gat 2224 
Phe Glu Val Phe Gly Glu lie Glu Glu Cys Thr Val Asn Leu Arg Asp 
700 705 710 

gat gga gac age tat ggt ttc ate acc tac cgt tac acc tgt gac get 2272 
Asp Gly Asp Ser Tyr Gly Phe lie Thr Tyr Arg Tyr Thr Cys Asp Ala 
715 720 725 

ttc get get ctt gag aat gga tat act tta cgc agg teg aac gaa act 2320 
Phe Ala Ala Leu Glu Asn Gly Tyr Thr Leu Arg Arg Ser Asn Glu Thr 
730 735 740 

gac ttc gag ctg tac ttt tgt gga egg aag caa ttt ttc aag tct aac 2368 
Asp Phe Glu Leu Tyr Phe Cys Gly Arg Lys Gin Phe Phe Lys Ser Asn 
745 750 755 

tat gca gac eta gat acc aac tea gac gat ttt gac cct get tec acc 2416 
Tyr Ala Asp Leu Asp Thr Asn Ser Asp Asp Phe Asp Pro Ala Ser Thr 
760 765 " 770 775 

aag age aag tat gac tct ctg gat ttt gat agt tta ctg aag gaa get 2464 
Lys Ser Lys Tyr Asp Ser Leu Asp Phe Asp Ser Leu Leu Lys Glu Ala 
780 785 790 



cag aga age ttg cgc agg taacgtgttc ccaggctgag gaatgacaga 2512 
Gin Arg Ser Leu Arg Arg 
795 



gagatggtca atacctcatg ggacagcgtg tcctttccca agactcttgc aagtcatact 2572

taggaatttc tcctacttta cactctctgt acaaaaataa aacaaaacaa aacaacaata 2632

acaacaacaa caacaacaat aacaacaaca accataccag aacaagaaca acggtttaca 2692

tgaacacagc tgctgaagag gcaagagaca gaatgataat ccagtaagca cacgtttatt 2752

cacgggtgtc agctttgctt tccctggagg ctcttggtga cagtgtgtgt gcgtgtgtgt 2812

gtgtgggtgt gcgtgtgtgt atgtgtgtgt gtgtacttgt ttggaaagta catatgtaca 2872

catgtgagga cttgggggca cctgaacaga acgaacaagg gcgacccctt caaatggcag 2932

catttccatg aagacacact taaaacctac aacttcaaaa tgttcgtatt ctatacaaaa 2992

ggaaaataaa taaatataaa aaaaaaaaaa aaaaaactcg agagatctat gaatcgtaga 3052

tactgaaaaa cccc 3066



<210> 5 
<211> 797 
<212> PRT 

<213> Mus musculus 



<400> 5 

Met Ala Trp Asp Met Cys Ser Gln Asp Ser Val Trp Ser Asp Ile Glu
1 5 10 15

Cys Ala Ala Leu Val Gly Glu Asp Gln Pro Leu Cys Pro Asp Leu Pro
20 25 30

Glu Leu Asp Leu Ser Glu Leu Asp Val Asn Asp Leu Asp Thr Asp Ser
35 40 45

Phe Leu Gly Gly Leu Lys Trp Cys Ser Asp Gln Ser Glu Ile Ile Ser
50 55 60

Asn Gln Tyr Asn Asn Glu Pro Ala Asn Ile Phe Glu Lys Ile Asp Glu
65 70 75 80

Glu Asn Glu Ala Asn Leu Leu Ala Val Leu Thr Glu Thr Leu Asp Ser
85 90 95

Leu Pro Val Asp Glu Asp Gly Leu Pro Ser Phe Asp Ala Leu Thr Asp
100 105 110

Gly Ala Val Thr Thr Asp Asn Glu Ala Ser Pro Ser Ser Met Pro Asp
115 120 125

Gly Thr Pro Pro Pro Gln Glu Ala Glu Glu Pro Ser Leu Leu Lys Lys
130 135 140

Leu Leu Leu Ala Pro Ala Asn Thr Gln Leu Ser Tyr Asn Glu Cys Ser
145 150 155 160

Gly Leu Ser Thr Gln Asn His Ala Ala Asn His Thr His Arg Ile Arg
165 170 175

Thr Asn Pro Ala Ile Val Lys Thr Glu Asn Ser Trp Ser Asn Lys Ala
180 185 190

Lys Ser Ile Cys Gln Gln Gln Lys Pro Gln Arg Arg Pro Cys Ser Glu
195 200 205

Leu Leu Lys Tyr Leu Thr Thr Asn Asp Asp Pro Pro His Thr Lys Pro
210 215 220

Thr Glu Asn Arg Asn Ser Ser Arg Asp Lys Cys Ala Ser Lys Lys Lys
225 230 235 240

Ser His Thr Gln Pro Gln Ser Gln His Ala Gln Ala Lys Pro Thr Thr
245 250 255

Leu Ser Leu Pro Leu Thr Pro Glu Ser Pro Asn Asp Pro Lys Gly Ser
260 265 270

Pro Phe Glu Asn Lys Thr Ile Glu Arg Thr Leu Ser Val Glu Leu Ser
275 280 285

Gly Thr Ala Gly Leu Thr Pro Pro Thr Thr Pro Pro His Lys Ala Asn
290 295 300

Gln Asp Asn Pro Phe Lys Ala Ser Pro Lys Leu Lys Pro Ser Cys Lys
305 310 315 320



Thr Val Val Pro Pro Pro Thr Lys Arg Ala Arg Tyr Ser Glu Cys Ser
325 330 335

Gly Thr Gln Gly Ser His Ser Thr Lys Lys Gly Pro Glu Gln Ser Glu
340 345 350

Leu Tyr Ala Gln Leu Ser Lys Ser Ser Gly Leu Ser Arg Gly His Glu
355 360 365

Glu Arg Lys Thr Lys Arg Pro Ser Leu Arg Leu Phe Gly Asp His Asp
370 375 380

Tyr Cys Gln Ser Leu Asn Ser Lys Thr Asp Ile Leu Ile Asn Ile Ser
385 390 395 400

Gln Glu Leu Gln Asp Ser Arg Gln Leu Asp Phe Lys Asp Ala Ser Cys
405 410 415

Asp Trp Gln Gly His Ile Cys Ser Ser Thr Asp Ser Gly Gln Cys Tyr
420 425 430

Leu Arg Glu Thr Leu Glu Ala Ser Lys Gln Val Ser Pro Cys Ser Thr
435 440 445

Arg Lys Gln Leu Gln Asp Gln Glu Ile Arg Ala Glu Leu Asn Lys His
450 455 460

Phe Gly His Pro Cys Gln Ala Val Phe Asp Asp Lys Ser Asp Lys Thr
465 470 475 480

Ser Glu Leu Arg Asp Gly Asp Phe Ser Asn Glu Gln Phe Ser Lys Leu
485 490 495

Pro Val Phe Ile Asn Ser Gly Leu Ala Met Asp Gly Leu Phe Asp Asp
500 505 510

Ser Glu Asp Glu Ser Asp Lys Leu Ser Tyr Pro Trp Asp Gly Thr Gln
515 520 525

Pro Tyr Ser Leu Phe Asp Val Ser Pro Ser Cys Ser Ser Phe Asn Ser
530 535 540

Pro Cys Arg Asp Ser Val Ser Pro Pro Lys Ser Leu Phe Ser Gln Arg
545 550 555 560

Pro Gln Arg Met Arg Ser Arg Ser Arg Ser Phe Ser Arg His Arg Ser
565 570 575

Cys Ser Arg Ser Pro Tyr Ser Arg Ser Arg Ser Arg Ser Pro Gly Ser
580 585 590

Arg Ser Ser Ser Arg Ser Cys Tyr Tyr Tyr Glu Ser Ser His Tyr Arg
595 600 605

His Arg Thr His Arg Asn Ser Pro Leu Tyr Val Arg Ser Arg Ser Arg
610 615 620

Ser Pro Tyr Ser Arg Arg Pro Arg Tyr Asp Ser Tyr Glu Ala Tyr Glu
625 630 635 640

His Glu Arg Leu Lys Arg Asp Glu Tyr Arg Lys Glu His Glu Lys Arg
645 650 655

Glu Ser Glu Arg Ala Lys Gln Arg Glu Arg Gln Lys Gln Lys Ala Ile
660 665 670

Glu Glu Arg Arg Val Ile Tyr Val Gly Lys Ile Arg Pro Asp Thr Thr
675 680 685

Arg Thr Glu Leu Arg Asp Arg Phe Glu Val Phe Gly Glu Ile Glu Glu
690 695 700

Cys Thr Val Asn Leu Arg Asp Asp Gly Asp Ser Tyr Gly Phe Ile Thr
705 710 715 720

Tyr Arg Tyr Thr Cys Asp Ala Phe Ala Ala Leu Glu Asn Gly Tyr Thr
725 730 735

Leu Arg Arg Ser Asn Glu Thr Asp Phe Glu Leu Tyr Phe Cys Gly Arg
740 745 750

Lys Gln Phe Phe Lys Ser Asn Tyr Ala Asp Leu Asp Thr Asn Ser Asp
755 760 765

Asp Phe Asp Pro Ala Ser Thr Lys Ser Lys Tyr Asp Ser Leu Asp Phe
770 775 780

Asp Ser Leu Leu Lys Glu Ala Gln Arg Ser Leu Arg Arg
785 790 795

<210> 6 
<211> 1893 
<212> DNA 

<213> Mus musculus 
<400> 6 

gaattcggca cgaggcctgc atgagtgtgt gctgtgtgtc agagtggatt ggagttgaaa 60 

aagcttgact ggcgtcattc gggagctgga tggcttggga catgtgcagc caagactctg 120 

tatggagtga catagagtgt gctgctctgg ttggtgagga ccagcctctt tgcccagatc 180 

ttcctgaact tgacctttct gaacttgatg tgaatgactt ggatacagac agctttctgg 240 

gtggattgaa gtggtgtagc gaccaatcgg aaatcatatc caaccagtac aacaatgagc 300 

ctgcgaacat atttgagaag atagatgaag agaatgaggc gaacttgcta gcggtcctca 360 

cagagacact ggacagtctc cccgtggatg aagacggatt gccctcattt gatgcactga 420 

cagatggagc cgtgaccact gacaacgagg ccagtccttc ctccatgcct gacggcaccc 480 

ctccccctca ggaggcagaa gagccgtctc tacttaagaa gctcttactg gcaccagcca 540 

acactcagct cagctacaat gaatgcagcg gtcttagcac tcagaaccat gcagcaaacc 600 

acacccacag gatcagaaca aaccctgcca ttgttaagac cgagaattca tggagcaata 660 

aagcgaagag catttgtcaa cagcaaaagc cacaaagacg tccctgctca gagcttctca 720 

agtatctgac cacaaacgat gaccctcctc acaccaaacc cacagaaaac aggaacagca 780 

gcagagacaa atgtgcttcc aaaaagaagt cccatacaca accgcagtcg caacatgctc 840 

aagccaaacc aacaacttta tctcttcctc tgaccccaga gtcaccaaat gaccccaagg 900 

gttccccatt tgagaacaag actattgagc gaaccttaag tgtggaactc tctggaactg 960 

cagctccact agtgccaagg gagcatccat gcatcattac atccaggtcg atattgaatg 1020 

tcttcatgca aagatgtctt tctaatttat aaatatgaac acatcacaca acttgtgttc 1080 

attctattaa aggtgtaaaa actaatttga tttcaaaata gctgttgtta gtaaagcaag 1140 

atgagagaaa ggagaatgtt cttgtggcag aaggcattta aatctattgc atatggagat 1200 

tttttttcag acactaccaa caggatttta tgtctgaaat ggaaatggaa aggcaatgtc 1260 

agcctaacaa ggtgatggct tgaaacacaa gacatgaagg aactttgtta gggaccaaaa 1320 

taactggtcc ccaattttat gtatatacat acatgttttg gctatcacta taaacatggt 1380 

gaaagcaatg gagctgtttt ataactgata aaaagatgaa tagaacaaaa taaccagctg 1440 

tctttttact ctcggaccac tgggttctgc ccatatttcc ttccattcac atatctttgg 1500 

ttaccttgtt tgaaatgggg tagacatgcg gttaatttgg tttgttatta tattatttgt 1560 

ttgaggattt cataaataag tgcaatatat ttgcatcatt tccaccccaa cacctcccaa 1620 

aaccacccat ctcaaattca tttactcttt ttctataatt gtttttgtca tatattacac 1680 

acacacaaag gcgcatacac acacacgcac acacaggcac acacacacac acacacacac 1740
acacacacac acacacacac tgagagttgc cctaatttag ggttgaccac ttagggttca 1800
ggtctcatcc ctgaaaaatg aagaagaaga agaagaagaa gaagaagaag aagaagaaga 1860 
agaagaagaa gaagaagaaa aaaaaaaaaa aaa 1893 

<210> 7 
<211> 320 

<212> PRT
<213> Mus musculus 

<400> 7 

Met Ala Trp Asp Met Cys Ser Gln Asp Ser Val Trp Ser Asp Ile Glu
1 5 10 15

Cys Ala Ala Leu Val Gly Glu Asp Gln Pro Leu Cys Pro Asp Leu Pro
20 25 30

Glu Leu Asp Leu Ser Glu Leu Asp Val Asn Asp Leu Asp Thr Asp Ser
35 40 45

Phe Leu Gly Gly Leu Lys Trp Cys Ser Asp Gln Ser Glu Ile Ile Ser
50 55 60

Asn Gln Tyr Asn Asn Glu Pro Ala Asn Ile Phe Glu Lys Ile Asp Glu
65 70 75 80

Glu Asn Glu Ala Asn Leu Leu Ala Val Leu Thr Glu Thr Leu Asp Ser
85 90 95

Leu Pro Val Asp Glu Asp Gly Leu Pro Ser Phe Asp Ala Leu Thr Asp
100 105 110

Gly Ala Val Thr Thr Asp Asn Glu Ala Ser Pro Ser Ser Met Pro Asp
115 120 125

Gly Thr Pro Pro Pro Gln Glu Ala Glu Glu Pro Ser Leu Leu Lys Lys
130 135 140

Leu Leu Leu Ala Pro Ala Asn Thr Gln Leu Ser Tyr Asn Glu Cys Ser
145 150 155 160

Gly Leu Ser Thr Gln Asn His Ala Ala Asn His Thr His Arg Ile Arg
165 170 175

Thr Asn Pro Ala Ile Val Lys Thr Glu Asn Ser Trp Ser Asn Lys Ala
180 185 190

Lys Ser Ile Cys Gln Gln Gln Lys Pro Gln Arg Arg Pro Cys Ser Glu
195 200 205

Leu Leu Lys Tyr Leu Thr Thr Asn Asp Asp Pro Pro His Thr Lys Pro
210 215 220

Thr Glu Asn Arg Asn Ser Ser Arg Asp Lys Cys Ala Ser Lys Lys Lys
225 230 235 240

Ser His Thr Gln Pro Gln Ser Gln His Ala Gln Ala Lys Pro Thr Thr
245 250 255

Leu Ser Leu Pro Leu Thr Pro Glu Ser Pro Asn Asp Pro Lys Gly Ser
260 265 270

Pro Phe Glu Asn Lys Thr Ile Glu Arg Thr Leu Ser Val Glu Leu Ser
275 280 285

Gly Thr Ala Ala Pro Leu Val Pro Arg Glu His Pro Cys Ile Ile Thr
290 295 300

Ser Arg Ser Ile Leu Asn Val Phe Met Gln Arg Cys Leu Ser Asn Leu
305 310 315 320



<210> 8 
<211> 1744 
<212> DNA 

<213> Mus musculus 
<220> 

<221> misc_feature 
<222> 1543 

<223> n = A,T,C or G

<400> 8

gaattcggca cgaggtcaga gtggattgga gttgaaaaag cttgactggc gtcattcggg 60 

agctggatgg cttgggacat gtgcagccaa gactctgtat ggagtgacat agagtgtgct 120 

gctctggttg gtgaggacca gcctctttgc ccagatcttc ctgaacttga cctttctgaa 180 

cttgatgtga atgacttgga tacagacagc tttctgggtg gattgaagtg gtgtagcgac 240 

caatcggaaa tcatatccaa ccagtacaac aatgagcctg cgaacatatt tgagaagata 300 

gatgaagaga atgaggcaaa cttgctagcg gtcctcacag agacactgga cagtctcccc 360

gtggatgaag acggattgcc ctcatttgat gcactgacag atggagccgt gaccactgac 420 

aacgaggcca gtccttcctc catgcctgac ggcacccctc cccctcagga ggcagaagag 480 

ccgtctctac ttaagaagct cttactggca ccagccaaca ctcagctcag ctacaatgaa 540 

tgcagcggtc ttagcactca gaaccatgca gcaaaccaca cccacaggat cagaacaaac 600

cctgccattg ttaagaccga gaattcatgg agcaataaag cgaagagcat ttgtcaacag 660 

caaaagccac aaagacgtcc ctgctcagag cttctcaagt atctgaccac aaacgatgac 720 

cctcctcaca ccaaacccac agaaaacagg aacagcagca gagacaaatg tgcttccaaa 780 

aagaagtccc atacacaacc gcagtcgcaa catgctcaag ccaaaccaac aactttatct 840 

cttcctctga ccccagagtc accaaatgac cccaagggtt ccccatttga gaacaagact 900 

attgagcgaa ccttaagtgt ggaactctct ggaactgcag gtgtaaaaac taatttgatt 960 

tcaaaatagc tgttgttagt taagcaagat gagagaaagg agaatgttct tgtggcagaa 1020 

ggcatttaaa tctattgcat atggagattt tttttcagac actaccaaca ggattttatg 1080 

tctgaaatgg aaatggaaag gcaatgtcag cctaacaagg tgatggcttg aaacacaaga 1140 

catgaaggaa ctttgttagg gaccaaaata actggtcccc aattttatgt atatacatac 1200 

atgttttggc tatcactata aacatggtga aagcaatgga gctgttttat aactgataaa 1260 

aagatgaata gaacaaaata accagctgtc tttttactct cggaccactg ggttctgccc 1320 

atatttcctt ccattcacat atctttggtt accttgtttg aaatggggta gacatgcggt 1380 

taatttggtt tgttattata ttatttgttt gaggatttca taaataagtg caatatattt 1440 

gcatcatttc caccccaaca cctcccaaaa ccacccatct caaattcatt tactcttttt 1500 

ctataattgt ttttgtcata tattacacac acacaaaggc acntacacac acacgcacac 1560 

acaggcacac acacacacac acacacacac acacacacac acacacactg agaattgccc 1620 

taatttaggg ttgaccactt agggttcagt ttttttccct ggaaaatggg ggggggggaa 1680 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1740 

aaaa 1744 

<210> 9 
<211> 300 
<212> PRT 

<213> Mus musculus 
<400> 9 



Met Ala Trp Asp Met Cys Ser Gln Asp Ser Val Trp Ser Asp Ile Glu
1 5 10 15

Cys Ala Ala Leu Val Gly Glu Asp Gln Pro Leu Cys Pro Asp Leu Pro
20 25 30

Glu Leu Asp Leu Ser Glu Leu Asp Val Asn Asp Leu Asp Thr Asp Ser
35 40 45

Phe Leu Gly Gly Leu Lys Trp Cys Ser Asp Gln Ser Glu Ile Ile Ser
50 55 60

Asn Gln Tyr Asn Asn Glu Pro Ala Asn Ile Phe Glu Lys Ile Asp Glu
65 70 75 80

Glu Asn Glu Ala Asn Leu Leu Ala Val Leu Thr Glu Thr Leu Asp Ser
85 90 95

Leu Pro Val Asp Glu Asp Gly Leu Pro Ser Phe Asp Ala Leu Thr Asp
100 105 110

Gly Ala Val Thr Thr Asp Asn Glu Ala Ser Pro Ser Ser Met Pro Asp
115 120 125

Gly Thr Pro Pro Pro Gln Glu Ala Glu Glu Pro Ser Leu Leu Lys Lys
130 135 140


Leu Leu Leu Ala Pro Ala Asn Thr Gln Leu Ser Tyr Asn Glu Cys Ser
145 150 155 160

Gly Leu Ser Thr Gln Asn His Ala Ala Asn His Thr His Arg Ile Arg
165 170 175

Thr Asn Pro Ala Ile Val Lys Thr Glu Asn Ser Trp Ser Asn Lys Ala
180 185 190

Lys Ser Ile Cys Gln Gln Gln Lys Pro Gln Arg Arg Pro Cys Ser Glu
195 200 205

Leu Leu Lys Tyr Leu Thr Thr Asn Asp Asp Pro Pro His Thr Lys Pro
210 215 220

Thr Glu Asn Arg Asn Ser Ser Arg Asp Lys Cys Ala Ser Lys Lys Lys
225 230 235 240

Ser His Thr Gln Pro Gln Ser Gln His Ala Gln Ala Lys Pro Thr Thr
245 250 255

Leu Ser Leu Pro Leu Thr Pro Glu Ser Pro Asn Asp Pro Lys Gly Ser
260 265 270

Pro Phe Glu Asn Lys Thr Ile Glu Arg Thr Leu Ser Val Glu Leu Ser
275 280 285

Gly Thr Ala Gly Val Lys Thr Asn Leu Ile Ser Lys
290 295 300











<210> 10 

<211> 3030 

<212> DNA 

<213> Homo sapiens 

<400> 10 

atgcctcctg tgtatgcctc tgagtatgtc ttgccactcc agggtggagg gtccggggag 60
gagcaactct atgctgactt tccagaactc gacctctccc agctggatgc cagcgacttt 120
gactcggcca cctgctttgg ggagctgcag tggtgcccag agaactcaga gactgaaccc 180
aaccagtaca gccccgatga ctccgagctc ttccagattg acagtgagaa tgaggccctc 240
ctggcagagc tcaccaagac cctggatgac atccctgaag atgacgtggg tctggctgcc 300
ttcccagccc tggatggtgg agacgctcta tcatgcacct cagcttcgcc tgccccctca 360
tctgcacccc ccagccctgc cccggagaag ccctcggccc cagcccctga ggtggacgag 420
ctctcactgc tgcagaagct cctcctggcc acatcctacc caacatcaag ctctgacacc 480
cagaaggaag ggaccgcctg gcgccaggca ggcctcagat ctaaaagtca acggccttgt 540
gttaaggcgg acagcaccca agacaagaag gctcccatga tgcagtctca gagccgaagt 600
tgtacagaac tacataagca cctcacctcg gcacagtgct gcctgcagga tcggggtctg 660
cagccaccat gcctccagag tccccggctc cctgccaagg aggacaagga gccgggtgag 720
gactgcccga gcccccagcc agctccagcc tctccccggg actccctagc tctgggcagg 780
gcagaccccg gtgccccggt ttcccaggaa gacatgcagg cgatggtgca actcatacgc 840
tacatgcaca cctactgcct cccccagagg aagctgcccc cacagacccc tgagccactc 900
cccaaggcct gcagcaaccc ctcccagcag gtcagatccc ggccctggtc ccggcaccac 960
tccaaagcct cctgggctga gttctccatt ctgagggaac ttctggctca agacgtgctc 1020
tgtgatgtca gcaaacccta ccgtctggcc acgcctgttt atgcctccct cacacctcgg 1080
tcaaggccca ggccccccaa agacagtcag gcctcccctg gtcgcccatc ctcggtggag 1140
gaggtaagga tcgcagcttc acccaagagc accgggccca gaccaagcct gcgcccactg 1200
cggctggagg tgaaaaggga ggtccgccgg cctgccagac tgcagcagca ggaggaggaa 1260
gacgaggaag aagaggagga ggaagaggaa gaagaaaaag aggaggagga ggagtggggc 1320
aggaaaaggc caggccgagg cctgccatgg acgaagctgg ggaggaagct ggagagctct 1380
gtgtgccccg tgcggcgttc tcggagactg aaccctgagc tgggcccctg gctgacattt 1440
gcagatgagc cgctggtccc ctcggagccc caaggtgctc tgccctcact gtgcctggct 1500
cccaaggcct acgacgtaga gcgggagctg ggcagcccca cggacgagga cagtggccaa 1560
gaccagcagc tcctacgggg accccagatc cctgccctgg agagcccctg tgagagtggc 1620
gacccaactt ttggcaagaa gagctttgag cagaccttga cagtggagct ctgtggcaca 1680
gcaggtgagc cagggggctt ccactggcag gtgccttcag gaaaacaccc gtgcatctct 1740
gagtttttca tcatgcatgg gcaaggactc accccaccca ccacaccacc gtacaagccc 1800
acagaggagg atcccttcaa accagacatc aagcatagtc taggcaaaga aatagctctc 1860
agcctcccct cccctgaggg cctctcactc aaggccaccc caggggctgc ccacaagctg 1920
ccaaagaagc acccagagcg aagtgagctc ctgtcccacc tgcgacatgc cacagcccag 1980
ccagcctccc aggctggcca gaagcgtccc ttctcctgtt cctttggaga ccatgactac 2040
tgccaggtgc tccgaccaga aggcgtcctg caaaggaagg tgctgaggtc ctgggagccg 2100
tctggggttc accttgagga ctggccccag cagggtgccc cttgggctga ggcacaggcc 2160
cctggcaggg aggaagacag aagctgtgat gctggcgccc cacccaagga cagcacgctg 2220
ctgagagacc atgagatccg tgccagcctc accaaacact ttgggctgct ggagaccgcc 2280

ctggaggagg aagacctggc ctcctgcaag agccctgagt atgacactgt ctttgaagac 2340 

agcagcagca gcagcggcga gagcagcttc ctcccagagg aggaagagga agaaggggag 2400 

gaggaggagg aggacgatga agaagaggac tcaggggtca gccccacttg ctctgaccac 2460

tgcccctacc agagcccacc aagcaaggcc aaccggcagc tctgttcccg cagccgctca 2520 

agctctggct cttcaccctg ccactcctgg tcaccagcca ctcgaaggaa cttcagatgt 2580 

gagagcagag ggccgtgttc agacagaacg ccaagcatcc ggcacgccag gaagcggcgg 2640 

gaaaaggcca ttggggaagg ccgcgtggtg tacattcaaa atctctccag cgacatgagc 2700 

tcccgagagc tgaagaggcg ctttgaagtg tttggtgaga ttgaggagtg cgaggtgctg 2760 

acaagaaata ggagaggcga gaagtacggc ttcatcacct accggtgttc tgagcacgcg 2820 

gccctctctt tgacaaaggg cgctgccctg aggaagcgca acgagccctc cttccagctg 2880 

agctacggag ggctccggca cttctgctgg cccagataca ctgactacga ttccaattca 2940 

gaagaggccc ttcctgcgtc agggaaaagc aagtatgaag ccatggattt tgacagctta 3000 

ctgaaagagg cccagcagag cctgcattga 3030 

<210> 11 
<211> 1009 
<212> PRT 

<213> Homo sapiens 
<400> 11 

Met Pro Pro Val Tyr Ala Ser Glu Tyr Val Leu Pro Leu Gln Gly Gly
1 5 10 15

Gly Ser Gly Glu Glu Gln Leu Tyr Ala Asp Phe Pro Glu Leu Asp Leu
20 25 30

Ser Gln Leu Asp Ala Ser Asp Phe Asp Ser Ala Thr Cys Phe Gly Glu
35 40 45

Leu Gln Trp Cys Pro Glu Asn Ser Glu Thr Glu Pro Asn Gln Tyr Ser
50 55 60

Pro Asp Asp Ser Glu Leu Phe Gln Ile Asp Ser Glu Asn Glu Ala Leu
65 70 75 80

Leu Ala Glu Leu Thr Lys Thr Leu Asp Asp Ile Pro Glu Asp Asp Val
85 90 95

Gly Leu Ala Ala Phe Pro Ala Leu Asp Gly Gly Asp Ala Leu Ser Cys
100 105 110

Thr Ser Ala Ser Pro Ala Pro Ser Ser Ala Pro Pro Ser Pro Ala Pro
115 120 125

Glu Lys Pro Ser Ala Pro Ala Pro Glu Val Asp Glu Leu Ser Leu Leu
130 135 140

Gln Lys Leu Leu Leu Ala Thr Ser Tyr Pro Thr Ser Ser Ser Asp Thr
145 150 155 160

Gln Lys Glu Gly Thr Ala Trp Arg Gln Ala Gly Leu Arg Ser Lys Ser
165 170 175

Gln Arg Pro Cys Val Lys Ala Asp Ser Thr Gln Asp Lys Lys Ala Pro
180 185 190

Met Met Gln Ser Gln Ser Arg Ser Cys Thr Glu Leu His Lys His Leu
195 200 205

Thr Ser Ala Gln Cys Cys Leu Gln Asp Arg Gly Leu Gln Pro Pro Cys
210 215 220

Leu Gln Ser Pro Arg Leu Pro Ala Lys Glu Asp Lys Glu Pro Gly Glu
225 230 235 240

Asp Cys Pro Ser Pro Gln Pro Ala Pro Ala Ser Pro Arg Asp Ser Leu
245 250 255

Ala Leu Gly Arg Ala Asp Pro Gly Ala Pro Val Ser Gln Glu Asp Met
260 265 270

Gln Ala Met Val Gln Leu Ile Arg Tyr Met His Thr Tyr Cys Leu Pro
275 280 285

Gln Arg Lys Leu Pro Pro Gln Thr Pro Glu Pro Leu Pro Lys Ala Cys
290 295 300

Ser Asn Pro Ser Gln Gln Val Arg Ser Arg Pro Trp Ser Arg His His
305 310 315 320

<210> 12

<211> 3664 

<212> DNA 

<213> Mus musculus 

<400> 12 

ctcgctccct cccccgggcg ggctcggcgc tgactccgcc gcacgctgca gccgcggctg 60
gaagatggcg gggaacgact gcggcgcgct gctggatgaa gagctctcgt ccttcttcct 120
caactatctc tctgacacgc agggtgggga ctctggagag gaacagctgt gtgctgactt 180
gccagagctt gacctctccc agctggacgc cagtgacttt gactcagcca cgtgctttgg 240
ggagctgcag tggtgcccgg agacctcaga gacagagccc agccagtaca gccccgatga 300
ctccgagctc ttccagattg acagtgagaa tgaagctctc ttggctgcgc ttacgaagac 360
cctggatgac atccccgaag acgatgtggg gctggctgcc ttcccagaac tggatgaagg 420
cgacacacca tcctgcaccc cagcctcacc tgccccctta tctgcacccc ccagccccac 480
cctggagagg cttctgtccc cagcgtctga cgtggacgag ctttcactgc tacagaagct 540
cctcctggcc acatcctccc caacagcaag ctctgacgct ctgaaggacg gggccacctg 600
gtcccagacc agcctcagtt ccagaagtca gcggccttgt gtcaaggtgg atggcaccca 660
ggataagaag acccccacac tgcgggctca gagccggcct tgtacggaac tgcataagca 720
cctcacttcg gtgctgccct gtcccagagt gaaagcctgc tccccaactc cgcacccgag 780
ccctcggctc ctctccaaag aggaggagga ggaggtgggg gaggattgcc caagcccttg 840
gccgactcca gcctcgcccc aagactccct agcacaggac acggccagcc ccgacagtgc 900
ccagcctccc gaggaggatg tgagggccat ggtacagctc attcgctaca tgcataccta 960
ctgcctgcct cagaggaagc tgccccaacg ggccccagag ccaatccccc aggcctgcag 1020
cagcctctcc aggcaggttc aaccccgatc ccggcatccc cccaaagcct tctggactga 1080
gttctctatc ctaagggaac ttctggccca agatatcctc tgtgatgtta gcaagcccta 1140
ccgcctggcc atacctgtct atgcttccct cacacctcag tccaggccca ggccccccaa 1200
ggacagtcag gcctcccctg cccactctgc catggcagaa gaggtgagaa tcactgcttc 1260
ccccaagagc accgggccta gacccagcct gcgtcctctg aggctggagg tgaaacggga 1320
tgttaacaag cctacaaggc aaaagcggga ggaagatgag gaggaggagg aggaagaaga 1380
agaagaggaa gaagaaaaag aagaggaaga agaggagtgg ggcaggaaga gaccaggtcg 1440
tggcctgcca tggaccaaca tagggaggaa gatggacagc tccgtgtgcc ccgtgcggcg 1500
ctccaggaga ctgaatccag agctgggtcc ctggctgaca ttcactgatg agcccttagg 1560
tcctctgccc tcgatgtgcc tggatacaga gacccacaac ctggaggaag acctgggcag 1620
cctcacagac agtagtcaag gccggcagct cccccaggga tcccagatcc ccgccctgga 1680 
aagcccctgt gagagtgggt gcggagacac agatgaagat ccaagctgcc cacagcccac 1740 
ttccagagac tcctccaggt gcctcatgct ggccttgtca caaagcgact ctcttggcaa 1800 
gaagagcttt gaggagtccc tgacggtgga gctttgcggc acggcaggac tcacgccacc 1860 
caccacacct ccatacaagc caatggagga ggaccccttc aagccagaca ccaagctcag 1920 
cccaggccaa gacacagctc ccagccttcc ctcccccgag gctcttccgc tcacagccac 1980 
cccaggagct tcccacaagc tgcccaagag gcacccagag cgaagcgagc tcctgtccca 2040 
tttgcagcat gccacaaccc aaccagtctc acaggctggc cagaagcgcc ccttctcctg 2100 
ctcctttgga gaccacgact actgccaggt gctcaggcca gaggctgccc tgcagaggaa 2160 
ggtgctgcgg tcctgggagc caatcggggt ccaccttgaa gacttggccc agcagggtgc 2220 
ccctctgcca acggaaacaa aggcccctag gagggaggca aaccagaact gtgaccctac 2280 
ccacaaggac agcatgcagc taagagacca tgagatccgt gccagtctca caaagcactt 2340 
tgggctgctg gagactgctc tggaaggtga agacctggcg tcctgtaaaa gcccggagta 2400 
tgacaccgta tttgaggaca gcagcagcag cagtggcgag agtagcttcc tgcttgagga 2460 
ggaggaggaa gaggaggagg gaggggaaga ggacgatgaa ggagaggact caggggtcag 2520 
ccctccctgc tctgatcact gcccctacca gagcccaccc agtaaggcca gtcggcagct 2580 
ctgctcccga agccgctcca gttccggctc ctcgtcctgc agctcctggt caccagccac 2640 
ccggaagaac ttcagacgtg agagcagagg gccctgttca gatggaaccc caagcgtccg 2700 
gcatgccagg aagcggcggg aaaaggccat cggtgaaggc cgtgtggtat acattcgaaa 2760 
tctctccagt gacatgagct ctcgggaact aaagaagcgc tttgaggtgt tcggtgagat 2820 
tgtagagtgc caggtgctga cgagaagtaa aagaggccag aagcacggtt ttatcagctt 2880 
ccggtgttca gagcacgctg ccctgtccgt gaggaacggc gccaccctga gaaagcgcaa 2940
tgagccctcc ttccacctga gctatggagg gctccggcac ttccgttggc ccagatacac 3000 
tgactatgat cccacatctg aggagtccct tccctcatct gggaaaagca agtacgaagc 3060 
catggatttt gacagcttac tgaaagaggc ccagcagagc ctgcattgat atcagcctta 3120 
accttcgagg aatacctcaa tacctcagac aaggcccttc caatatgttt acgttttcaa 3180 
agaaaagagt atatgagaag gagagcgagc gagcgagcga gcgagcgagt gagcgtgaga 3240 
gatcacacag gagagagaaa gacttgaatc tgctgtcgtt tcctttaaaa aaaaaaaaac 3300 
gaaaaacaaa aacaaatcaa tgtttacatt gaacaaagct gcttccgtcc gtctgtccgt 3360 
ccgtccgtcc gtccgtgagt taccattctg atgatgttcc actgccacgt tagcgtcgtc 3420 
ctcgcttcca gcggatcgtc ctgggtgcgc ctccaagtgc tgtcagtcgt cctctgcccc 3480 
tcccacccga ctgacttcct tctgttagac ttgagctgtg ttcacataac atcttctgtc 3540 
tgtagagtgt gatgatgaca ttgttacttg tgaatagaat caggagttag aaactcattt 3600 
ttaattgaag aaaaaaaaag tatatcctta aaaagaaaaa aaaaaaaaca aatgtaaaaa 3660 
aaaa 3664

Ser Pro Ala Ser Asp Val Asp Glu Leu Ser
145 150

Leu Ala Thr Ser Ser Pro Thr Ala Ser Ser
165 170

Ala Thr Trp Ser Gln Thr Ser Leu Ser Ser
180 185

Val Lys Val Asp Gly Thr Gln Asp Lys Lys
195 200

Gln Ser Arg Pro Cys Thr Glu Leu His Lys
210 215

Pro Cys Pro Arg Val Lys Ala Cys Ser Pro
225 230

Arg Leu Leu Ser Lys Glu Glu Glu Glu Glu
245 250

Ser Pro Trp Pro Thr Pro Ala Ser Pro Gln
260 265

Thr Ala Ser Pro Asp Ser Ala Gln Pro Pro
275 280

Met Val Gln Leu Ile Arg Tyr Met His Thr
290 295

Lys Leu Pro Gln Arg Ala Pro Glu Pro Ile
305 310

Leu Ser Arg Gln Val Gln Pro Arg Ser Arg
325 330

Trp Thr Glu Phe Ser Ile Leu Arg Glu Leu
340 345

Cys Asp Val Ser Lys Pro Tyr Arg Leu Ala

Leu Thr Pro Gln Ser Arg Pro Arg Pro Pro

Pro Ala His Ser Ala Met Ala Glu Glu Val

Lys Ser Thr Gly Pro Arg Pro Ser Leu Arg
405 410

Lys Arg Asp Val Asn Lys Pro Thr Arg Gln

Glu Glu Glu Trp Gly Arg Lys Arg Pro Gly
450 455

Leu Glu Glu Asp Leu Gly Ser Leu Thr Asp

Leu Pro Gln Gly Ser Gln Ile Pro Ala Leu

Gly Cys Gly Asp Thr Asp Glu Asp Pro Ser
545 550

Arg Asp Ser Ser Arg Cys Leu Met Leu Ala

625 630 635 640

Gly Ala Ser His Lys Leu Pro Lys Arg His Pro Glu Arg Ser Glu Leu
645 650 655

Leu Ser His Leu Gln His Ala Thr Thr Gln Pro Val Ser Gln Ala Gly
660 665 670

Gln Lys Arg Pro Phe Ser Cys Ser Phe Gly Asp His Asp Tyr Cys Gln
675 680 685

Val Leu Arg Pro Glu Ala Ala Leu Gln Arg Lys Val Leu Arg Ser Trp
690 695 700

Glu Pro Ile Gly Val His Leu Glu Asp Leu Ala Gln Gln Gly Ala Pro
705 710 715 720

Leu Pro Thr Glu Thr Lys Ala Pro Arg Arg Glu Ala Asn Gln Asn Cys
725 730 735

Asp Pro Thr His Lys Asp Ser Met Gln Leu Arg Asp His Glu Ile Arg
740 745 750

Ala Ser Leu Thr Lys His Phe Gly Leu Leu Glu Thr Ala Leu Glu Gly
755 760 765

Glu Asp Leu Ala Ser Cys Lys Ser Pro Glu Tyr Asp Thr Val Phe Glu
770 775 780

Asp Ser Ser Ser Ser Ser Gly Glu Ser Ser Phe Leu Leu Glu Glu Glu
785 790 795 800

Glu Glu Glu Glu Glu Gly Gly Glu Glu Asp Asp Glu Gly Glu Asp Ser
805 810 815

Gly Val Ser Pro Pro Cys Ser Asp His Cys Pro Tyr Gln Ser Pro Pro
820 825 830

Ser Lys Ala Ser Arg Gln Leu Cys Ser Arg Ser Arg Ser Ser Ser Gly
835 840 845

Ser Ser Ser Cys Ser Ser Trp Ser Pro Ala Thr Arg Lys Asn Phe Arg
850 855 860

Arg Glu Ser Arg Gly Pro Cys Ser Asp Gly Thr Pro Ser Val Arg His
865 870 875 880

Ala Arg Lys Arg Arg Glu Lys Ala Ile Gly Glu Gly Arg Val Val Tyr
885 890 895

Ile Arg Asn Leu Ser Ser Asp Met Ser Ser Arg Glu Leu Lys Lys Arg
900 905 910

Phe Glu Val Phe Gly Glu Ile Val Glu Cys Gln Val Leu Thr Arg Ser
915 920 925

Lys Arg Gly Gln Lys His Gly Phe Ile Thr Phe Arg Cys Ser Glu His
930 935 940

Ala Ala Leu Ser Val Arg Asn Gly Ala Thr Leu Arg Lys Arg Asn Glu
945 950 955 960

Pro Ser Phe His Leu Ser Tyr Gly Gly Leu Arg His Phe Arg Trp Pro
965 970 975

Arg Tyr Thr Asp Tyr Asp Pro Thr Ser Glu Glu Ser Leu Pro Ser Ser
980 985 990

Gly Lys Ser Lys Tyr Glu Ala Met Asp Phe Asp Ser Leu Leu Lys Glu
995 1000 1005

Ala Gln Gln Ser Leu His
1010

<210> 15